Dhruv Patel

About me

I am an MS Robotics student at the School of Interactive Computing, Georgia Tech. I currently conduct research on Cross-Embodiment Learning for Robot Manipulation from Human-Play data at the Robot Learning and Reasoning (RL2) lab with Dr. Danfei Xu. My interests broadly lie at the intersection of Robotics, Computer Vision and Deep Learning.

I spent summer 2024 interning at Honda Research Institute, USA, working on Scene Understanding for Autonomous Driving at Intersections. Before Georgia Tech, I explored open-source software development as a Google Summer of Code'23 contributor at Unify AI, aiming to optimize SLAM for real-world deployment in Robotics applications. Prior to this, I was a Project Associate at the Robotics Research Centre (RRC), where I worked on Scene Understanding for Autonomous Driving in Adverse Weather Conditions (affiliated with Queensland University of Technology (QUT) and ZF Group), and spearheaded the IHub Project Mobility on UAV-based Visual Remote Sensing for Civil Infrastructure Safety Assessment.

I spent the summer of 2020 working on Simultaneous Localization and Mapping (SLAM) for Level-5 Autonomy at Swaayatt Robots. After this, I transitioned to a Software Engineering role at Amdocs and, alongside it, collaborated with the Norwegian Biometrics Laboratory (NTNU, Norway) on research into the Image Super-Resolution problem.

I always look forward to interesting collaborations or chats on AI. Feel free to ping me on LinkedIn or through email.

Interests

  • Robotics & Computer Vision
  • Deep Learning
  • AI & Neuroscience

Education

  • MS in Robotics, August 2023 - May 2025

    Georgia Institute of Technology (Georgia Tech)

  • B.Tech in Electronics & Communication Engg, July 2016 - July 2020

    Sardar Vallabhbhai National Institute of Technology, Surat

Recent

Work Experience

Graduate Student Researcher

Robot Learning and Reasoning Lab (RL2), Georgia Tech

Jan 2024 - Present
Advisor: Dr. Danfei Xu
  • Cross-Embodiment Learning for Robot Manipulation from Human-Play Data [WebPage]
  • Keywords: Scene Understanding, ADAS, Robotics, Deep Learning

    Research Associate Intern

    Honda Research Institute, USA

    May 2024 - August 2024

    • Developed perception algorithms for intersection detection and navigation for HRI’s Autonomous Vehicle (AV) platform.
    Keywords: Scene Understanding, ADAS, Robotics, Deep Learning

    Open-Source Software Developer

    Google Summer of Code 2023

    June 2023 - August 2023

      Multi-backend framework support of GradSLAM in Ivy [WebSite]

      • Google Summer of Code'23 Contributor at Ivy - unify.ai
      • Developed multi-backend framework support (PyTorch, JAX, NumPy, TensorFlow) for the GradSLAM library in Ivy, aiming to optimize deployment through highly efficient frameworks like JAX.
    Keywords: Robotics, Deep Learning, PyTorch, JAX, NumPy, TensorFlow

    Project Associate

    Robotics Research Centre (RRC), IIIT Hyderabad

    July 2021 – July 2023

      Scene Understanding for Autonomous Driving
      Advisors: Prof. Madhava Krishna and Dr. Sourav Garg

      • Collaborated with ZF Friedrichshafen group and Queensland University of Technology (QUT) Robotics on improving scene understanding, exploring object detection/tracking and segmentation for weather-agnostic settings.
      • Proposed Gated Differentiable Image Processing (GDIP) for object detection in adverse conditions, which establishes a new SOTA for foggy and low-light conditions.
      • Researched downstream problems such as video object detection/tracking in adverse weather settings, exploring various spatio-temporal techniques. Also explored Probabilistic Graphical Models (PGMs) for weather-agnostic feature refinement.


      DodgeDrone: Vision-based Agile Drone Flight [WebPage]
      Advisor: Prof. Madhava Krishna

      • Devised a high-level control strategy for obstacle avoidance through Imitation Learning in conjunction with Reinforcement Learning.


      UAV-based Visual Remote Sensing for Automated Building Inspection (UVRSABI)
      Advisors: Prof. Madhava Krishna, Dr. Ravi Kiran and Dr. Harikumar Kandath

      • Automated assessment of civil structures with the help of visual remote sensing.
      • Used structure-from-motion, state estimation, odometry, etc. in conjunction with classical Computer Vision and Deep Learning-based visual inspection algorithms to robustly estimate critical structural parameters.
      • Developed and released an open-source library (UVRSABI) for the community. More details here.
    Keywords: Robotics, Computer Vision, Deep Learning, Reinforcement Learning, 3D Reconstruction, Autonomous Driving, ADAS, UAVs

    Software Engineer

    AMDOCS

    Aug 2020 – June 2021, Pune
    Scrum Master/Team Lead: Shreyas Kulkarni
  • Responsible for B2B production-level full-stack software development.
  • Developed cross-functional telecom software solutions for Comcast's Orion project (USA).
  • Collaborated with global product owners, ensuring end-to-end feature development, integration and validation with the testing team.
  • Technical Stack: Java, ReactJS, SQL, Spring Boot, Maven, and Jenkins.
  • Keywords: Java, SQL, ReactJS, Object-oriented Programming, Microservices, Jenkins, Maven, Spring

    Research Intern

    Swaayatt Robots

    April 2020 – July 2020
    Advisor: Sanjeev Sharma (Founder & CEO - Swaayatt Robots)
  • Improved Visual Odometry and SLAM pipelines for Level-5 Autonomy.
  • Devised a semantic variant of the Iterative Closest Point (ICP) algorithm that outperforms vanilla ICP in both matching loss and convergence time on the SemanticKITTI dataset.
  • Developed a low-level C++ library.
  • Keywords: Robotics, Mathematical Optimization, SLAM, ICP, LiDARs

    Deep Learning Research Intern

    Sardar Vallabhbhai National Institute of Technology, Surat

    May 2019 - July 2019
    Advisor: Dr. Kishor Upla (Assistant Professor, ECED)
  • Implemented the state-of-the-art FaceNet paper and validated it on a custom-made dataset of 25 students.
  • Keywords: Face Recognition, Deep Learning

    Projects

    EgoMimic: Scaling Imitation Learning via Egocentric Video

    End-to-end robot learning for generalizable bimanual robot manipulation from embodied human data

    Asia-Pacific Robotics Contest 2018 & 2019

    Autonomous Navigation for OmniDrive and Quadruped robots

    Behavior Cloning and Dynamics Models for Robot Manipulation

    MLP, RNN, Diffusion policy variants and learning dynamics models in Robomimic

    Vision-Language Models for Dense Feedback Reward

    VLMs for Natural Language Human Feedback

    UAV-based Assessment of Civil Structures

    Automated building inspection using aerial images captured by a UAV.

    Obstacle Avoidance for UAV

    Predicting an obstacle-free patch for high level control commands

    Fytbuddy: A real-time gym fitness trainer

    A web app-based e-trainer using a flask web server and a Deep Learning-based model for posture correction

    Autonomous Agricultural Robot

    AGRIBOT for solving the crop-weed classification problem

    RFID System

    Identification system using RFID reader, LCD display and Atmel AVR microcontroller.

    Image Super-Resolution

    A triplet loss-based optimization framework for Image Super-Resolution

    Mapping for Level-5 Autonomy

    Improved point cloud registration and mapping using semantic ICP

    Object Detection in Adverse Weather Settings

    Gated Differentiable Image Processing (GDIP), a domain-agnostic architecture for object detection in adverse conditions.

    Implementation of Path Searching/Tracking algorithms

    Implemented path search/tracking algorithms like Pure Pursuit, Dijkstra, A*, etc.

    Teaching

    Graduate Teaching Assistant
  • Georgia Tech CS 6476: Computer Vision (grad-level) (Fall 2024)
  • Georgia Tech CS 6476: Computer Vision (undergrad-level) (Spring 2024)
  • Georgia Tech CS 6476: Computer Vision (grad-level) (Fall 2023)

    Featured

    • UAV-based Visual Remote Sensing for Automated Building Inspection (UVRSABI)
      • Spotlight paper presentation at the CVCIE Workshop at ECCV 2022
      • Inaugurated by Central Road Research Institute to deploy in Telangana, India (Sept 2022)
    • AGRIBOT was funded under the TEQIP-III program by the Govt. of India, and we also presented it at the open-source ROS Agriculture community meet. [YouTube]
    • Secured 12th and 13th ranks in the Asia-Pacific Robot Contest - RoboCon 2018 and 2019, respectively, among 100-plus universities. [RoboCon2019 YouTube] [RoboCon2018 YouTube]
    • Best Working Model - Stirling Engine at the National Science Day Celebrations, Physical Research Laboratory (PRL), India, during 12th grade.