Hi, I am Avnish Patel

Computer Vision Engineer

Turning Pixels into Perception


About Me

I hold a Master's degree in Robotics from Northeastern University, with a focus on Computer Vision, and have over three years of experience in Machine Learning and Computer Vision.

My expertise, drawn from both research and industry, lies in building end-to-end pipelines using advanced 3D vision techniques such as SLAM, Structure from Motion, Pose Estimation, and Depth Estimation.

I am passionate about leveraging 3D vision and deep learning to develop innovative solutions that bridge the gap between research and real-world applications.

Publications

Logarithmic Lenses: Exploring Log RGB Data for Image Classification

B. Maxwell, A. Patel

Computer Vision and Pattern Recognition Conference (CVPR), 2024

View Publication

Professional Experience

Systems Vision Engineer

Medtronic

Sept 2025 – Present

Boston, MA

  • Designed a Pose Graph pipeline in GTSAM that combines Robot Kinematics with TensorRT-based YOLOv8 vision keypoints to reduce drift and accurately track instrument points in out-of-view scenarios
  • Engineered an Arm Collision Avoidance System for the medical robot using Trimesh models built from its URDF and Kinematics, with RTI DDS publish-subscribe messaging for real-time visualization in Open3D
  • Established a secure communication bridge between Production and Engineering robotic systems to transfer Kinematics messages

Research Assistant - AirLab

Carnegie Mellon University

July 2025 – Nov 2025

Remote

  • Integrated Relative Pose Graph Optimization in ROS2 (C++) using the GTSAM Fixed-Lag Smoother in the IMU Preintegration module of a multi-modal IMU-LiDAR sensor fusion system to reduce long-term drift in SLAM
  • Achieved 35.8% lower ATE and 52.5% lower RPE on the SubT-MRS Laurel Cavern dataset with Velodyne LiDAR
  • Executed trajectory mapping using Livox LiDAR and IMU sensors on the Unitree G1 robot, applying a low-pass filter to mitigate IMU bias and enhance mapping accuracy
  • Built a Thermal-Inertial Odometry system with uncertainty-aware feature weighting for pose estimation on the CART dataset
  • Enhanced an Uncertainty SLAM system by integrating Voxel Maps, resulting in improved localization robustness and accuracy in challenging environments

Surgical R&T Machine Learning Engineer

Medtronic

Jan 2024 – Apr 2025

Boston, MA

  • Built an end-to-end SLAM pipeline with DROID-SLAM for dense depth estimation in surgical videos, optimizing camera trajectory using GTSAM and refining 3D reconstruction with Bundle Adjustment and LightGlue feature matching
  • Developed a real-time Ground Truth pose estimation pipeline using OptiTrack camera capture and robot kinematics with PnP and ROMA feature detection for training deep learning models on instrument articulation
  • Performed Semantic Segmentation for Robot-Assisted Surgery on 10,000 medical images from an S3 bucket, containing both mask and line annotations, to segment hernias using the Swin Base Transformer
  • Developed a YOLOv8-based pipeline for precise, real-time detection of surgical instrument tips in medical images
  • Applied Monocular Depth Estimation with Depth Anything to measure the metric distance between two instruments in an image
  • Implemented a PyTorch wrapper for Optical Flow with Unimatch, served through FastAPI, converting models to ONNX and TensorRT for a 10x reduction in real-time annotation of medical image frames, with 1-second latency

Research Student - Computer Vision

Northeastern University

May 2023 – Dec 2023

Boston, MA

  • Researched raw log RGB data's impact on deep networks like ResNet-18, improving classification performance and robustness to intensity and color variations
  • Co-developed the novel RAW10 dataset (10k DNG & JPG images each, 10 categories) to advance log RGB research in the Computer Vision community
  • Published a CVPR 2024 paper on this research

Artificial Intelligence Engineer

Kisan Drip Irrigation Pvt Ltd

Aug 2020 – Aug 2022

India

  • Integrated ElasticFusion (RGB-D SLAM) in C++ to align multi-view point clouds from Intel RealSense D455 cameras, enabling accurate 3D Reconstruction for pipe inspection and defect analysis
  • Experimented with PointNet-based deep learning models in Python for point cloud classification to enhance complex defect identification, achieving a 30% improvement over traditional 2D vision methods
  • Implemented YOLO-based object detection to localize drippers in pipe assemblies, enabling precise hole punching
  • Deployed the 3D vision pipeline as a containerized FastAPI service integrated into on-premises manufacturing workflows

Education

Northeastern University

Master of Science in Robotics (specialization in Computer Vision)

GPA: 3.6/4.0

September 2022 – December 2024

Boston, MA

Relevant Coursework:

Robotics Sensing & Navigation, Advanced Computer Vision, Autonomous Field Robotics

Featured Projects

Technical Skills

Programming

Python, C++, CUDA

Frameworks & Libraries

PyTorch, OpenCV, Open3D, VTK, ROS2, GTSAM, G2O, OptiTrack SDK

Computer Vision & Deep Learning

Semantic Segmentation, Object Detection, Monocular Depth Estimation, Pose Estimation, 3D Reconstruction, SLAM Bundle Adjustment

Simulation & Visualization

CARLA, Gazebo, RViz, Three.js

Cloud & Deployment

AWS S3, ONNX Runtime, NVIDIA TensorRT, PyTorch DDP, FastAPI

Get In Touch

Contact Information

Location

Boston, USA

Connect With Me