
Pranay Mathur

Google Scholar  |  Experience  |  Publications  |  Projects  |  Awards  |  GSoC

I'm a Machine Learning Engineer at Cobot, where I develop perception algorithms for our robot, Proxie. My mission is to develop multi-modal foundation models and perception pipelines that act as robust representational priors for generally capable agents.

I graduated from Georgia Tech, where I worked on scaling generalizable bimanual manipulation policies using egocentric human play data, advised by Danfei Xu. I have an undergraduate degree in electronics from BITS Pilani, where I was the recipient of the Innovator of the Year Award. As part of my undergraduate thesis, I worked on resource-aware visual-inertial odometry algorithms at the Autonomous Robots Lab, advised by Kostas Alexis.

Prior to my current role, I worked at MathWorks on the computer-vision team, developing vision-foundation models. As part of the Google Summer of Code '22 program, I worked as an open-source developer on the project Landmark Mapping using Quantized EfficientDet on Edge TPUs. I've also spent time at Addverb Technologies and KPIT, developing software for robots and self-driving cars.

I am an aviation enthusiast and love building and flying quadcopters. Apart from aviation, I love cycling, reading, playing football, amateur photography and sketching.

Feel free to drop me an e-mail if you want to chat with me!

 ~  Email  |  Resume  |  Github  |  LinkedIn  |  Twitter  |  Art  ~ 


EgoMimic: Scaling Imitation Learning via Egocentric Video
[Paper] [Website]
Simar Kareer, Dhruv Patel*, Ryan Punamiya*, Pranay Mathur*, Shuo Cheng, Chen Wang, Judy Hoffman, Danfei Xu
Accepted to IEEE International Conference on Robotics and Automation (ICRA) 2025
X-Embodiment Workshop, Conference on Robot Learning (CoRL) 2024
Neural Visibility Field for Uncertainty-Driven Active Mapping
[Paper] [Website]
Shangjie Xue, Jesse Dill, Pranay Mathur, Frank Dellaert, Panagiotis Tsiotras, Danfei Xu
IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR) 2024
Proactive Human-Robot Interaction using Visuo-Lingual Transformers and Object Interaction Graphs
[Paper] [Video] [Poster] [Best Paper Award]
Pranay Mathur
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) - Geriatronics AI Workshop 2023
Sparse Image-based Navigation Architecture to Mitigate the need of precise Localization in Mobile Robots
[Paper]
Pranay Mathur, Rajesh Kumar, Sarthak Upadhyay
arXiv 2022
Resource-aware Online Parameter Adaptation for Computationally-constrained Visual-Inertial Navigation Systems
[Paper] [Video]
Pranay Mathur, Nikhil Khedekar, Kostas Alexis
IEEE-RAS International Conference on Advanced Robotics 2021
A Generalized Kalman Filter Augmented Deep-Learning based Approach for Autonomous Landing in MAVs
[Paper] [Code] [Best Paper Award]
Pranay Mathur, Yash Jangir, Neena Goveas
IEEE International Symposium of Asian Control Association on Intelligent Robotics and Industrial Automation 2021
Multi-Sensor Fusion-Based Object Detection Implemented on ROS
[Paper] [Code]
Pranay Mathur, Ravish Kumar, Rahul Jain
Springer International Conference on Machine Learning and Autonomous Systems 2021
BCI Controlled Quadcopter using SVM and Recursive LSE Implemented on ROS
[Paper] [Code]
Kshitij Chhabra, Pranay Mathur, Veeky Baths
IEEE International Conference on Systems, Man and Cybernetics 2020

Behavior-Cloning: A comparison of MLP, RNN and Diffusion-based policies for LfD

[Code]

  • Implemented MLP, RNN and Diffusion policy variants for behavior cloning from diverse demonstration datasets
  • Evaluated how well each policy class learns robust behavior on 3 datasets of varying difficulty using Robosuite and Robomimic

EAM: Embodiment Agnostic Long-Horizon Manipulation using Human-Play Data

[Report] [Webpage]

  • We propose learning a policy using human-play data - trajectories of humans freely interacting with their environment.
  • To bridge the embodiment gap, we experimented with multiple techniques, such as aligning visual representations from DINOv2 + LoRA (Rein), minimizing an auxiliary KL divergence loss, masking the manipulator, and co-training with robot play data

Object Manipulation using learnt Dynamics Models

[CS 8803: Report] [Code]

  • Implemented data-driven approaches for learning dynamics models and predicting future states
  • Used random-shooting with gradient-based trajectory optimization to improve success rate

Human-Motion Prediction: With great power comes great res-pose-ability

[Project Report] [Code]

  • Worked on Convolutional Seq-to-Seq models for human-motion prediction on computationally-constrained systems
  • Achieved comparable performance to several baselines implemented in the fairmotion library at reduced computational costs

Perception Open3d

[ROS World Lightning Talk] [Code]

  • Developed a ROS wrapper for converting Open3D point clouds to ROS point clouds and vice versa
  • Released two packages, open3d_conversions and open3d_conversions_examples, that are part of ROS-perception

MAV navigation in visually-degraded GPS denied environments using RTAB-Map SLAM

[Code]

  • Developed an algorithm for autonomous navigation of RGBD camera equipped MAVs in GPS-denied environments using RTAB-Map V-SLAM
  • Developed custom computer vision algorithms using CNN-based attention maps for obstacle recognition and avoidance, implemented in TensorFlow and OpenCV
  • Selected for funding by the EEE Dept. and Sandbox Fabrication Lab, BITS Goa

Project Kratos - Mars Rover

[Website]

  • Contributed to building a Mars rover that ranked 10th of 25 teams in the Indian Rover Challenge
  • Led the communication sub-system and implemented a scheduling algorithm to transmit multiple camera and data feeds with minimal latency
  • Set up communication networks using the Ubiquiti Networks platform and automated processes using Bash scripting in Linux

    Machine Learning Engineer, Perception | Collaborative Robotics
    Jan '25 - present

    • Developing infrastructure to help scale data annotation, model training, monitoring, evaluation and inference.

    Engineering Development Group Engineer | MathWorks
    June '24 - Jan '25

    • Worked with the Image Processing and Computer-Vision team on importing vision-foundation models.
    • Reduced model inference time by 10x through a custom attention layer implementation.

    Engineering Development Group Intern | MathWorks
    May '23 - Aug '23

    • Worked on feature enhancements in the Simulink Test Toolbox for the 2024a release
    • Resolved performance issues in C++ and MATLAB back-end in pre-existing features

    Open-Source Developer | Google Summer of Code
    June '22 - Sept '22

    • Used OpenCV and EfficientDet to identify a track for an autonomous vehicle to follow [Project] [GitHub] [Video]
    • Quantized and ported the model to TFLite for inference on the Coral USB Accelerator

    Graduate Engineer Trainee | Addverb Technologies
    August '21 - July '22

    • Worked on appearance-based Navigation of ground-based robots using semantic-scene understanding
    • Integrated autonomous mobile-robots with 5G cloud-control based capabilities
    • Deployed a localization scoring and recovery method for augmenting LiDAR-based SLAM in mobile robots

    Intern | Technoyantra
    January '21 - August '21

    • Developed a localization algorithm using EKF-based fusion of pose estimates from fiducial tags and LIDAR-based SLAM [GitHub]
    • Implemented pipelines in ROS2 for Point Cloud segmentation, statistical outlier removal and voxel filtering

    Undergraduate Researcher | Autonomous Robots Lab
    July '20 - January '21

    • Developed a generalizable Resource-Aware algorithm for deployment of Visual Inertial Odometry algorithms on aerial vehicles with computationally constrained onboard systems under the guidance of Prof. Kostas Alexis [Video] [Paper]
    • Contributed to Intel ISL Open3D and released two packages incorporated into ROS-perception.
    • Presented the packages as a Lightning Talk at ROS-World 2020. [GitHub] [Video]

    Technical Intern | KPIT
    May '20 - July '20

    • Worked on Object Detection based on multi-modal sensor fusion using 3D LIDAR, monocular camera and a RADAR [Paper]
    • Developed a novel algorithm for the detection of vehicles and pedestrians, implemented using TensorFlow and ROS [GitHub]

    Research Intern | CSIR-Central Electronics Engineering Research Institute
    May '19 - July '19

    • Implemented RTAB-Map SLAM for Autonomous Navigation of Quadcopters using PX4 and ROS in visually-degraded GPS-denied environments using an RGBD camera
    • Implemented multi-modal sensor fusion and image noise removal through classical image processing pipelines under the guidance of Dr. S. A. Akbar, Chief Scientist, CEERI Pilani, India

    Research Assistant | BITS Pilani
    Aug '19 - Dec '19

    • Worked with Prof. Neena Goveas on a supervised approach for MAV landing using an EKF. Published in IEEE IRIA 2021 and received the Best Paper Award.
    • Developed an SVM-based EEG classification approach to control a quadcopter for 3D reconstruction and exploration, advised by Prof. Veeky Baths. Published in IEEE SMC 2020.



    Click here to view my awards and media spotlight



    This website is a modification of Jon Barron's website. You can find the source code here