Dhruv Patel

About me

I am an MS Robotics student at the School of Interactive Computing, Georgia Tech. I currently conduct research on Cross-Embodiment Learning for Robot Manipulation from Human-Play data at the Robot Learning and Reasoning (RL2) lab with Dr. Danfei Xu. Presently, I am extending our recent work, EgoMimic (CoRL 2024), to diverse multi-task settings and various axes of generalization (behavior, scene), exploring Vision-Language Models (VLMs) and large-scale embodied human datasets. My interests lie broadly at the intersection of Robotics, Computer Vision, and Deep Learning.

UPDATE: I am actively seeking full-time opportunities in AI/Robotics starting May 2025 — feel free to reach out if there's a fit!

I spent summer 2024 interning at Honda Research Institute, USA, working on Scene Understanding for Autonomous Driving at Intersections. Before Georgia Tech, I explored open-source software development as a Google Summer of Code '23 contributor at Unify AI, with the aim of optimizing SLAM for real-world deployment in Robotics applications. Prior to this, I was a Project Associate at the Robotics Research Centre (RRC), IIIT Hyderabad, where I worked on Scene Understanding for Autonomous Driving in Adverse Weather Conditions (affiliated with Queensland University of Technology (QUT) and ZF Group) and spearheaded the IHub Project Mobility on UAV-based Visual Remote Sensing for Civil Infrastructure Safety Assessment. I spent the summer of 2020 working on Simultaneous Localization and Mapping (SLAM) for Level-5 Autonomy at Swaayatt Robots. After that, I transitioned to a Software Engineering role at Amdocs and, alongside it, collaborated with the Norwegian Biometrics Laboratory (NTNU, Norway) on research into the image super-resolution problem.

I am always open to collaborations, research discussions, or just an interesting chat on AI & Robotics. Feel free to connect with me on LinkedIn or via email!

Interests

Robotics & Computer Vision
Deep Learning
AI & Neuroscience

Education

MS in Robotics, August 2023 - May 2025

Georgia Institute of Technology (Georgia Tech)

B.Tech in Electronics & Communication Engg, July 2016 - July 2020

Sardar Vallabhbhai National Institute of Technology, Surat

Recent

May '25: Extended EgoMimic to study behavior and scene-level generalization from human data. Also part of an active collaborative effort on collecting large-scale embodied human data for robot learning (More details soon!)
Feb '25: Meta AI published a story covering our work, EgoMimic. Check it out here. EgoMimic was also accepted at ICRA 2025.
Jan '25: Starting as Graduate Teaching Assistant for CS 3630: Intro to Perception and Robotics.
Nov '24: Presented EgoMimic at the Conference on Robot Learning (CoRL) 2024 (Munich, Germany).
August '24: Graduate Teaching Assistant for CS6476: Computer Vision
May '24: Summer Intern at Honda Research Institute, USA.
Jan '24: Graduate Teaching Assistant for CS4476: Intro to Computer Vision.
Nov '23: Started working with Prof. Danfei Xu at the Robot Learning and Reasoning Lab (RL2), Georgia Tech.
August '23: Started MS in Robotics at Georgia Tech! Graduate Teaching Assistant for CS6476: Computer Vision.
May '23: Proposal accepted at Google Summer of Code (GSoC)!
May '23: Presented GDIP at ICRA 2023.
Nov '22: Presented SRTGAN at CVIP 2022.
Oct '22: Presented UVRSABI as a spotlight paper at CVCIE workshop, ECCV 2022.
Sept '22: UVRSABI was released as open-source software at the IHub Data Mobility Summit 2022 and will be deployed for civil inspection by CRRI (Govt. of India).

Work Experience

Graduate Student Researcher

Robot Learning and Reasoning Lab (RL2), Georgia Tech

Nov 2023 - Present

Advisor: Dr. Danfei Xu
Cross-Embodiment Learning for Robot Manipulation using Embodied Human-Play Data [WebPage]

Keywords: Robot Learning, Manipulation, Imitation Learning

Research Associate Intern

Honda Research Institute, USA

May 2024 - August 2024

Developed perception algorithms for intersection detection and navigation for HRI’s Autonomous Vehicle (AV) platform.

Keywords: Scene Understanding, ADAS, Robotics, Deep Learning

Open-Source Software Developer

Google Summer of Code 2023

June 2023 - August 2023

Multi-backend framework support of GradSLAM in Ivy [WebSite]

Google Summer of Code '23 Contributor at Ivy - unify.ai
Developed multi-backend framework support (PyTorch, JAX, NumPy, TensorFlow) for the GradSLAM library in Ivy, with the aim of optimizing deployment through highly efficient frameworks like JAX.

Keywords: Robotics, Deep Learning, PyTorch, JAX, NumPy, TensorFlow

Project Associate

Robotics Research Centre (RRC), IIIT Hyderabad

July 2021 – July 2023

Scene Understanding for Autonomous Driving

Advisors: Prof. Madhava Krishna and Dr. Sourav Garg

Collaborated with ZF Friedrichshafen group and Queensland University of Technology (QUT) Robotics on improving perception and scene understanding for adverse weather conditions.
Proposed GDIP (Gated Differentiable Image Processing), which establishes a new state of the art for object detection in foggy and low-light conditions.
Researched downstream problems like video object detection/tracking and explored Probabilistic Graphical Models (PGMs) for weather-agnostic feature refinement.

UAV-based Visual Remote Sensing for Automated Building Inspection (UVRSABI)

Advisors: Prof. Madhava Krishna, Dr. Ravi Kiran, and Dr. Harikumar Kandath

Automated assessment of civil structures with the help of visual remote sensing.
Utilized Structure-from-Motion, state estimation, odometry, etc., in conjunction with classical Computer Vision and Deep Learning-based visual inspection algorithms to robustly estimate critical structural parameters.
Developed and released an open-source library (UVRSABI) for the community. More details here.

Keywords: Robotics, Computer Vision, Deep Learning, Reinforcement Learning, 3D Reconstruction, Autonomous Driving, ADAS, UAVs

Software Engineer

AMDOCS

Aug 2020 – June 2021, Pune

Scrum Master/Team Lead: Shreyas Kulkarni

Responsible for B2B production-level full-stack software development.
Developed cross-functional telecom software solutions for Comcast's Orion project (USA).
Technical Stack: Java, ReactJS, SQL, Spring Boot, Maven, and Jenkins.

Keywords: Java, SQL, ReactJS, Object-oriented Programming, Microservices, Jenkins, Maven, Spring

Research Intern

Swaayatt Robots

April 2020 – July 2020

Advisor: Sanjeev Sharma (Founder & CEO - Swaayatt Robots)

Improved Visual Odometry and SLAM pipelines for Level-5 Autonomy.
Devised a semantic variant of the Iterative Closest Point (ICP) algorithm, outperforming vanilla ICP in terms of matching loss and convergence time on the Semantic KITTI dataset.
Developed a low-level C++ library.

REPORT PDF

Keywords: Robotics, Mathematical Optimization, SLAM, ICP, LiDARs

Deep Learning Intern

Sardar Vallabhbhai National Institute of Technology, Surat

May 2019 - July 2019

Advisor: Dr. Kishor Upla (Assistant Professor, ECED)

Implemented the state-of-the-art FaceNet paper and validated it on a custom dataset of 25 students.

Keywords: Face Recognition, Deep Learning

Publications

EgoMimic: Scaling Imitation Learning via Egocentric Video

Simar Kareer, Dhruv Patel*, Ryan Punamiya*, Pranay Mathur*, Shuo Cheng, Chen Wang, Judy Hoffman, Danfei Xu
Conference on Robot Learning (CoRL) 2024 Workshops: Learning Fine/Dexterous Manipulation and X-Embodiment

PDF Code Website Video Hardware Datasets

GDIP: Gated Differentiable Image Processing for Object-Detection in Adverse Conditions

Sanket Kalwar*, Dhruv Patel*, Aakash Aanegola, Krishna Reddy Konda, Sourav Garg, K. Madhava Krishna
International Conference on Robotics and Automation (ICRA) 2023

PDF Code Website

UAV-based Visual Remote Sensing for Automated Building Inspection

Kushagra Srivastava*, Dhruv Patel*, Aditya Kumar Jha, Mohhit Kumar Jha, Jaskirat Singh, Ravi Kiran Sarvadevabhatla, Pradeep Kumar Ramancharla, Harikumar Kandath, K. Madhava Krishna
Spotlight paper at European Conference on Computer Vision (ECCV) Workshop 2022

PDF Code Website Documentation

SRTGAN: Triplet Loss based Generative Adversarial Network for Real-World Super-Resolution

Dhruv Patel*, Abhinav Jain*, Simran Bawkar, Manav Khorasiya, Kalpesh Prajapati, Kishor Upla, Kiran Raja, Raghavendra Ramachandra, Christoph Busch
7th International Conference on Computer Vision & Image Processing (CVIP) 2022

PDF Code Website

Design of an Autonomous Agricultural Robot for Real-Time Weed Detection using CNN

Dhruv Patel*, Meet Gandhi*, Shankaranarayanan H.*, Anand Darji
AVES 2021 conference

PDF Code Website

Projects

Zero-shot policy adaptation: Diffusion Models + LLMs

EgoMimic: Scaling Imitation Learning via Egocentric Video

Asia-Pacific Robotics Contest 2018 & 2019

Behavior Cloning and Dynamics Models for Robot Manipulation

Vision-Language Models for Dense Feedback Reward

UAV-based Assessment of Civil Structures

Obstacle Avoidance for UAV

Fytbuddy: A real-time gym fitness trainer

Autonomous Agricultural Robot

RFID System

Image Super-Resolution

Mapping for Level-5 Autonomy

Object Detection in Adverse Weather Settings

Implementation of Path Searching/Tracking algorithms
