Hi, I am Avnish Patel

Computer Vision Engineer

Turning Pixels into Perception

About Me

I hold a Master of Science in Robotics from Northeastern University, with a focus on Computer Vision, and have over three years of experience in Machine Learning and Computer Vision.

My expertise lies in building end-to-end pipelines using advanced 3D vision techniques like SLAM, Structure from Motion, Pose Estimation, and Depth Estimation, with research and industry experience.

I am passionate about leveraging 3D vision and deep learning to develop innovative solutions that bridge the gap between research and real-world applications.

LinkedIn GitHub Resume

Publications

Logarithmic Lenses: Exploring Log RGB Data for Image Classification

B. Maxwell, A. Patel

Computer Vision and Pattern Recognition Conference (CVPR), 2024

View Publication

Professional Experience

Systems Vision Engineer

Medtronic

Sept 2025 – Present

Boston, MA

•Designed a Pose Graph pipeline in GTSAM using Robot Kinematics and TensorRT-based YOLOv8 vision keypoints to reduce drift and accurately track instrument points in out-of-view scenarios
•Engineered an Arm Collision Avoidance System using Trimesh models of the medical robot built from its URDF and Kinematics, with RTI DDS publish-subscribe messaging for real-time visualization in Open3D
•Established a secure communication bridge between Production and Engineering Robotic systems to transfer Kinematics messages effectively

Research Assistant - AirLab

Carnegie Mellon University

July 2025 – Nov 2025

Remote

•Integrated Relative Pose Graph Optimization in ROS2 (C++) using the GTSAM Fixed-Lag Smoother in the IMU Preintegration module of a multi-modal IMU-LiDAR sensor fusion system to reduce long-term drift in SLAM
•Achieved 35.8% lower Absolute Trajectory Error (ATE) and 52.5% lower Relative Pose Error (RPE) on the SubT-MRS Laurel Cavern dataset with Velodyne LiDAR
•Executed trajectory mapping using Livox LiDAR and IMU sensors on the Unitree G1 robot, applying a low-pass filter to mitigate IMU bias and enhance mapping accuracy
•Built a Thermal-Inertial Odometry system with uncertainty-aware feature weighting for pose estimation on the CART dataset
•Enhanced an Uncertainty SLAM system by integrating Voxel Maps, resulting in improved localization robustness and accuracy in challenging environments
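
For readers unfamiliar with the ATE metric quoted above, here is a minimal sketch of how an ATE value and a percentage reduction are commonly computed. It assumes the estimated and ground-truth trajectories are already time-associated and aligned to the same frame. The function names and the three-pose trajectories are hypothetical toy values for illustration, not data from the SubT-MRS experiments.

```python
import math

def ate_rmse(estimated, ground_truth):
    """RMSE of per-pose translation error between two aligned trajectories."""
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and the same length")
    # Squared Euclidean distance between each estimated and true position.
    squared = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared) / len(squared))

def percent_reduction(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline

# Toy positions in metres (hypothetical, not experimental data).
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
baseline = [(0.0, 0.0, 0.0), (1.0, 0.4, 0.0), (2.0, 0.8, 0.0)]
optimized = [(0.0, 0.0, 0.0), (1.0, 0.2, 0.0), (2.0, 0.4, 0.0)]

ate_base = ate_rmse(baseline, ground_truth)
ate_opt = ate_rmse(optimized, ground_truth)
print(f"ATE baseline: {ate_base:.3f} m, optimized: {ate_opt:.3f} m")
print(f"Reduction: {percent_reduction(ate_base, ate_opt):.1f}%")
```

In practice, evaluation tools usually perform the trajectory alignment step before computing this error. The sketch skips that step to keep the metric itself visible.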

Surgical R&T Machine Learning Engineer

Medtronic

Jan 2024 – Apr 2025

Boston, MA

•Built an end-to-end SLAM pipeline with DROID-SLAM for dense depth estimation in surgical videos, optimizing camera trajectory using GTSAM and refining 3D reconstruction with Bundle Adjustment and LightGlue feature matching
•Developed a real-time Ground Truth pose estimation pipeline using OptiTrack camera capture and robot kinematics with PnP and ROMA feature detection for training deep learning models on instrument articulation
•Performed Semantic Segmentation for Robot-Assisted Surgery on 10,000 medical images from an S3 bucket, containing both mask and line annotations, to segment hernias using the Swin Base Transformer
•Developed a YOLOv8-based pipeline for precise real-time detection of surgical instrument tips in medical images
•Applied Monocular Depth Estimation with Depth Anything to measure the metric distance between two instruments in an image
•Implemented a PyTorch wrapper with Optical Flow on FastAPI using Unimatch, converting models to ONNX and TensorRT for a 10x reduction in real-time annotation time for medical image frames, with 1-second latency

Research Student - Computer Vision

Northeastern University

May 2023 – Dec 2023

Boston, MA

•Researched raw log RGB data's impact on deep networks like ResNet-18, improving classification performance and robustness to intensity and color variations
•Co-developed the novel RAW10 dataset (10k DNG & JPG images each, 10 categories) to advance log RGB research in the Computer Vision community
•Published a CVPR 2024 paper on this research
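
One intuition behind working in log RGB can be sketched in a few lines: a multiplicative change in intensity on linear pixel values becomes a constant additive offset after taking the logarithm. The snippet below is only an illustration of that property. The helper name, the `eps` guard, and the pixel values are hypothetical, and it does not reproduce the exact preprocessing used in the paper.

```python
import math

def to_log_rgb(pixel, eps=1e-6):
    """Natural log of each linear channel; eps guards against log(0)."""
    return tuple(math.log(max(c, eps)) for c in pixel)

# A linear (raw-like) RGB pixel and the same pixel at half the intensity.
pixel = (0.20, 0.10, 0.05)
darker = tuple(0.5 * c for c in pixel)

log_pixel = to_log_rgb(pixel)
log_darker = to_log_rgb(darker)

# Halving the intensity shifts every log channel by the same constant, log(2).
offsets = [a - b for a, b in zip(log_pixel, log_darker)]
print(offsets)
```

Because the offset is identical across channels, differences between log channels are unchanged by the intensity scaling in this toy case.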

Artificial Intelligence Engineer

Kisan Drip Irrigation Pvt Ltd

Aug 2020 – Aug 2022

India

•Integrated ElasticFusion (RGB-D SLAM) in C++ to align multi-view point clouds from Intel RealSense D455 cameras, enabling accurate 3D Reconstruction for pipe inspection and defect analysis
•Experimented with PointNet-based deep learning models in Python for point cloud classification to enhance complex defect identification, achieving a 30% improvement over traditional 2D vision methods
•Implemented YOLO-based object detection to localize drippers in pipe assemblies, enabling precise hole punching
•Deployed the 3D vision pipeline as a containerized FastAPI service integrated into on-premises manufacturing workflows

Education

Northeastern University

Master of Science in Robotics (specialization in Computer Vision)

GPA: 3.6/4.0

September 2022 – December 2024

Boston, MA

Relevant Coursework:

Robotics Sensing & Navigation, Advanced Computer Vision, Autonomous Field Robotics

Featured Projects

Technical Skills

Programming

Python, C++, CUDA

Frameworks & Libraries

PyTorch, OpenCV, Open3D, VTK, ROS2, GTSAM, G2O, OptiTrack SDK

Computer Vision & Deep Learning

Semantic Segmentation, Object Detection, Monocular Depth Estimation, Pose Estimation, 3D Reconstruction, SLAM, Bundle Adjustment

Simulation & Visualization

CARLA, Gazebo, RViz, Three.js

Cloud & Deployment

AWS S3, ONNX Runtime, NVIDIA TensorRT, PyTorch DDP, FastAPI

Get In Touch

Contact Information

avnishp@andrew.cmu.edu

Location

Boston, USA
