PhD student in Statistical Machine Learning
Dept. of Computing Science, RLAI & AMII
University of Alberta
gauthamv dot 529 at gmail dot com

Google Scholar Profile

       

About Me

I read. I write. I build stuff.

I’m interested in building machines with animal-like intelligence. Specifically, I aim to understand algorithmic principles that could enable robots to continually learn, adapt, develop, and improve throughout their lives. In pursuit of this goal, I design and develop reinforcement learning (RL) algorithms and continual learning systems for real-world robots.

I also have a strong side interest in Neuroscience, Evolutionary Biology and Quantum Computing. All are part of an overarching goal to understand the emergence of intelligence.


NIT Trichy, India
2011 - 2015
IIIT Hyderabad, India
Summer 2014
Kindred Systems Inc, Canada
2017 - 2020
University of Alberta, Canada
2015 - 2017
2020 - Present
University of Freiburg, Germany
March - June, 2023

I’m a PhD student in Statistical Machine Learning at the University of Alberta. I work under the supervision of Rupam Mahmood with the Reinforcement Learning and Artificial Intelligence (RLAI) Group. My PhD research focuses on real-time, online learning and continual adaptation on robots. I mostly focus on policy gradient methods, real-time learning architectures and model-based reinforcement learning.

Recently, I visited the Neurobotics Lab headed by Joschka Boedecker at the University of Freiburg, Germany. We worked on a novel framework for skill learning and adaptation for assistive robots using reinforcement learning. The work also involves integrating very noisy electroencephalogram (EEG) signals decoded from a patient’s brain, which carry preference and failure information.

Previously, I was a Machine Learning Researcher at Kindred Systems Inc. As a member of the AI Research Team in Toronto, I developed Deep Reinforcement Learning techniques to improve the overall throughput of the company’s product (SORT) at e-commerce fulfillment centres such as Gap Inc. I was also responsible for the design, implementation and evaluation of learning algorithms and robot infrastructure as part of the research and publication efforts at Kindred (e.g., SenseAct). I spent three wonderful years at Kindred, initially working with Geordie Rose, Suzanne Gildert and Olivia Norton in Vancouver and subsequently with James Bergstra, Dmytro Korenkevych and Rupam Mahmood in Toronto.

I graduated with an M.Sc. (Thesis) in Computing Science from the University of Alberta in 2017. I worked under the supervision of Patrick M. Pilarski with the BLINC and RLAI labs. During my master’s, I mostly worked with rehabilitative and assistive robots. My thesis research was on “Teaching a Powered Prosthetic Arm with an Intact Arm Using Reinforcement Learning”. We used ideas from Learning from Demonstration and Actor-Critic Reinforcement Learning to explore the possibilities for synergistic, context-aware control of a prosthetic arm. This work won the 2017 M.Sc. Outstanding Thesis Award in Computing Science.

In a past life, I studied Instrumentation and Control Engineering at the National Institute of Technology (NIT), Tiruchirappalli. Under the guidance of G. Saravana Ilango, my team devised control strategies for an autonomous robotic vacuum cleaner for solar panels, which garnered accolades at the Texas Instruments Innovation Challenge (2014). In addition, I evaluated methods for model predictive control, real-time trajectory generation and motion planning for quadcopters while working with K. Madhava Krishna and V. Sankaranarayanan.

I also maintain an academic blog titled Machinae Animatae and a personal blog titled Musings of an Enlightened Idiot.

My CV is available here.


Talks

Two Issues of Autonomous Robot Learning

I discuss some practical, oft-ignored challenges in continual learning on real-world robots. More specifically, I address two research questions: (i) how to specify reinforcement learning tasks, and (ii) how to set up a real-time learning agent.

Tea Time Talks: Reward (Mis-)Specification in Reinforcement Learning

natChat: Neurotech in Artificial Intelligence (2023)


Publications

Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers

Gautham Vasan, Mohamed Elsayed, Alireza Azimi*, Jiamin He*, Fahim Shahriar, Colin Bellinger, Martha White, A. Rupam Mahmood
NeurIPS 2024
Paper Code


Streaming Deep Reinforcement Learning Finally Works

Mohamed Elsayed, Gautham Vasan, A. Rupam Mahmood
Pre-Print, NeurIPS FitML Workshop 2024
Paper


Revisiting Sparse Rewards for Goal-Reaching Reinforcement Learning

Gautham Vasan, Yan Wang, Fahim Shahriar, James Bergstra, Martin Jagersand, A. Rupam Mahmood
RLC 2024
Paper Demo


Versatile and Generalizable Manipulation via Goal-Conditioned Reinforcement Learning with Grounded Object Detection

Huiyi Wang, Fahim Shahriar, Alireza Azimi, Gautham Vasan, A. Rupam Mahmood, Colin Bellinger
CoRL MRM-D Workshop 2024
Paper


Autonomous Skill Acquisition for Robots Using Graduated Learning

Gautham Vasan
AAMAS Doctoral Consortium 2024
Paper


MaDi: Learning to Mask Distractions for Generalization in Visual Deep Reinforcement Learning

Bram Grooten, Tristan Tomilin, Gautham Vasan, Matthew E Taylor, A Rupam Mahmood, Meng Fang, Mykola Pechenizkiy, Decebal Constantin Mocanu
AAMAS 2024
Paper Code Video


Correcting Discount-Factor Mismatch in On-Policy Policy Gradient Methods

Fengdi Che, Gautham Vasan, A. Rupam Mahmood
ICML 2023
Paper


Real-Time Reinforcement Learning for Vision-Based Robotics Utilizing Local and Remote Computers

Yan Wang*, Gautham Vasan*, A. Rupam Mahmood
ICRA 2023
Paper Code Video


Autoregressive policies for continuous control deep reinforcement learning

Dmytro Korenkevych, A Rupam Mahmood, Gautham Vasan, James Bergstra
IJCAI 2019
Paper Post Video


Benchmarking reinforcement learning algorithms on real-world robots

A Rupam Mahmood, Dmytro Korenkevych, Gautham Vasan, William Ma, James Bergstra
CoRL 2018
Paper Post Video SenseAct

UR-Reacher-2


Context-Aware Learning from Demonstration: Using Camera Data to Support the Synergistic Control of a Multi-Joint Prosthetic Arm

Gautham Vasan, Patrick M Pilarski
BioRob 2018
Paper Poster


Learning from demonstration: Teaching a myoelectric prosthesis with an intact limb via reinforcement learning

Gautham Vasan, Patrick M Pilarski
ICORR 2017
Spotlight presentation at Rehabweek 2017.
Paper Video & Metadata


Confident Decision Making with General Value Functions

Craig Sherstan, Marlos C. Machado, Jaden Travnik, Adam White, Gautham Vasan, Patrick M. Pilarski
RLDM 2017


Mirrored Bilateral Training of a Myoelectric Prosthesis with a non-amputated arm via Actor-Critic Reinforcement Learning

Gautham Vasan, Patrick M Pilarski
RLDM 2017
Spotlight presentation (20min)
Poster Slides


Neurohex: A Deep Q-Learning Hex Agent

Kenny Young, Gautham Vasan, Ryan Hayward
Computer Games Workshop at IJCAI 2016
Paper


Autonomous Visual Tracking and Landing of a Quadrotor on a Moving Platform

Juhi Ajmera, PR Siddharthan, KM Ramaravind, Gautham Vasan, Naresh Balaji, V Sankaranarayanan
IEEE ICIIP 2015
Paper Video


A Control Strategy for an Autonomous Robotic Vacuum Cleaner for Solar Panels

G Aravind, Gautham Vasan, TSB Gowtham Kumar, R Naresh Balaji, G Saravana Ilango
IEEE Texas Instruments India Educators Conference 2014
Phase-I Winners and finalists (top 19 among 2000+ teams)
Paper Video Presentation


Miscellaneous Stuff I’m Proud To Have Been A Part Of

  • A short stint as a research volunteer with The Hospital for Sick Children (SickKids).
  • Cerebral Palsy and Spasticity Trials. I had the pleasure of working with medical doctors on a study assessing functional gain in patients affected by stroke or spasticity using assistive robots. I built tools to analyze the recorded sensory information and set up a robot interface for 12 patients. Thanks to Patrick Pilarski, Trevor Lashyn, Matthew Curran and Ming Chan at the University of Alberta for looping me in!
  • Festember, the annual international cultural festival of NIT Trichy. Festember is especially close to my heart since it was a labor of love. A lot of talented, passionate folks came together and poured tremendous effort, time and resources into creating something special :) I spent two great years with the Marketing team and was the Treasurer of Festember 2014. As the Treasurer, I handled the finances of the festival (~INR 20 Million) and executed several key decisions regarding budget, expenditure, resource management for teams, etc.
  • Spider, an R&D club at NIT Trichy. We conducted tech talks and workshops focusing on microcontrollers and embedded programming.
  • Pragyan, the annual technical festival of NIT Trichy. I worked with the Guest Lectures and Crossfire Teams. Personal highlights: (i) helping moderate a wonderful panel discussion on “Failing educational institutions?” and (ii) having Jamie Hyneman, host of Discovery’s MythBusters, as a guest lecturer for Pragyan 14!

The End