Dept. of Computing Science, RLAI & AMII
University of Alberta
gauthamv dot 529 at gmail dot com
Google Scholar Profile
About Me
I read. I write. I build stuff.
I’m interested in building machines with animal-like intelligence. Specifically, I aim to understand algorithmic principles that could enable robots to continually learn, adapt, develop, and improve throughout their lives. In pursuit of this goal, I design and develop reinforcement learning (RL) algorithms and continual learning systems for real-world robots.
I also have a strong side interest in Neuroscience, Evolutionary Biology and Quantum Computing. All are part of an overarching goal to understand the emergence of intelligence.
I’m a PhD student in Statistical Machine Learning at the University of Alberta. I work under the supervision of Rupam Mahmood with the Reinforcement Learning and Artificial Intelligence (RLAI) Group. My PhD research focuses on real-time, online learning and continual adaptation on robots. I mostly focus on policy gradient methods, real-time learning architectures and model-based reinforcement learning.
Recently, I visited the Neurobotics Lab headed by Joschka Boedecker at the University of Freiburg, Germany. We worked on a novel framework for skill learning and adaptation for assistive robots using reinforcement learning. This also involves the integration of very noisy electroencephalogram (EEG) signals decoded from a patient’s brain, which include preference and failure information.
Previously, I was a Machine Learning Researcher at Kindred Systems Inc. As a member of the AI Research Team in Toronto, I developed Deep Reinforcement Learning techniques to improve the overall throughput of the product (SORT) at e-commerce fulfillment centres such as those of Gap Inc. I was also responsible for the design, implementation and evaluation of learning algorithms and robot infrastructure as a part of the research and publication efforts at Kindred (e.g., SenseAct). I spent three wonderful years at Kindred, initially working with Geordie Rose, Suzanne Gildert and Olivia Norton in Vancouver and subsequently with James Bergstra, Dmytro Korenkevych and Rupam Mahmood in Toronto.
I graduated with an M.Sc. (Thesis) in Computing Science from the University of Alberta in 2017. I worked under the supervision of Patrick M. Pilarski with the BLINC and RLAI labs. During my master's, I mostly worked with rehabilitative and assistive robots. My thesis research was on “Teaching a Powered Prosthetic Arm with an Intact Arm Using Reinforcement Learning”. We used ideas from Learning from Demonstration and Actor-Critic Reinforcement Learning to explore the possibilities for synergistic, context-aware control of a prosthetic arm. This work won the 2017 M.Sc. Outstanding Thesis Award in Computing Science.
In a past life, I studied Instrumentation and Control Engineering at the National Institute of Technology (NIT), Tiruchirappalli. Under the guidance of G. Saravana Ilango, my team devised control strategies for an autonomous robotic vacuum cleaner for solar panels, which garnered accolades at the Texas Instruments Innovation Challenge (2014). In addition, I evaluated methods for model predictive control, real-time trajectory generation and motion planning for quadcopters while working with K. Madhava Krishna and V. Sankaranarayanan.
I also maintain an academic blog titled Machinae Animatae and a personal blog titled Musings of an Enlightened Idiot.
My CV is available here.
Talks
Two Issues of Autonomous Robot Learning
I discuss some practical, oft-ignored challenges in continual learning on real-world robots. More specifically, I address two research questions: (i) how to specify reinforcement learning tasks, and (ii) how to set up a real-time learning agent.
Tea Time Talks: Reward (Mis-)Specification in Reinforcement Learning
natChat: Neurotech in Artificial Intelligence (2023)
Publications
Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers
Streaming Deep Reinforcement Learning Finally Works
Pre-Print, NeurIPS FitML Workshop 2024
Paper
Revisiting Sparse Rewards for Goal-Reaching Reinforcement Learning
Versatile and Generalizable Manipulation via Goal-Conditioned Reinforcement Learning with Grounded Object Detection
CoRL MRM-D Workshop 2024
Paper
Autonomous Skill Acquisition for Robots Using Graduated Learning
AAMAS Doctoral Consortium 2024
Paper
MaDi: Learning to Mask Distractions for Generalization in Visual Deep Reinforcement Learning
Correcting Discount-Factor Mismatch in On-Policy Policy Gradient Methods
ICML 2023
Paper
Real-Time Reinforcement Learning for Vision-Based Robotics Utilizing Local and Remote Computers
Autoregressive policies for continuous control deep reinforcement learning
Benchmarking reinforcement learning algorithms on real-world robots
CoRL 2018
Paper
Post
Video
SenseAct
Context-Aware Learning from Demonstration: Using Camera Data to Support the Synergistic Control of a Multi-Joint Prosthetic Arm
Learning from demonstration: Teaching a myoelectric prosthesis with an intact limb via reinforcement learning
ICORR 2017
Spotlight presentation at Rehabweek 2017.
Paper
Video & Metadata
Confident Decision Making with General Value Functions
RLDM 2017
Mirrored Bilateral Training of a Myoelectric Prosthesis with a non-amputated arm via Actor-Critic Reinforcement Learning
RLDM 2017
Spotlight presentation (20min)
Poster
Slides
Neurohex: A Deep Q-Learning Hex Agent
Computer Games Workshop at IJCAI 2016
Paper
Autonomous Visual Tracking and Landing of a Quadrotor on a Moving Platform
A Control Strategy for an Autonomous Robotic Vacuum Cleaner for Solar Panels
IEEE Texas Instruments India Educators Conference 2014
Phase-I Winners and finalists (top 19 among 2000+ teams)
Paper
Video
Presentation
Miscellaneous Stuff I’m Proud To Have Been A Part Of
- A short stint as a research volunteer with The Hospital for Sick Children (SickKids).
- Cerebral Palsy and Spasticity Trials. I had the pleasure of working with medical doctors on a study assessing functional gain in patients affected by stroke or spasticity using assistive robots. I built tools to analyze the recorded sensory information and set up a robot interface for 12 patients. Thanks to Patrick Pilarski, Trevor Lashyn, Matthew Curran and Ming Chan at the University of Alberta for looping me in!
- Festember, the annual international cultural festival of NIT Trichy. Festember is especially close to my heart since it was a labor of love. A lot of talented, passionate folks came together and poured tremendous effort, time and resources into creating something special :) I spent two great years with the Marketing team and was the Treasurer of Festember 2014. As the Treasurer, I handled the finances of the festival (~INR 20 Million) and executed several key decisions with regard to budget, expenditure, resource management for teams, etc.
- Spider, an R&D club at NIT Trichy. We conducted tech talks and workshops focusing on microcontrollers and embedded programming.
- Pragyan, the annual technical festival of NIT Trichy. I worked with the Guest Lectures and Crossfire Teams. Personal highlights: (i) helping moderate a wonderful panel discussion on “Failing educational institutions?” and (ii) having Jamie Hyneman, host of Discovery's MythBusters, as a guest lecturer for Pragyan 14!
The End