Mandi Zhao

I am a PhD student at Stanford University advised by Prof. Shuran Song. My B.S. and M.S. are from UC Berkeley, where I was fortunate to work with Prof. Pieter Abbeel and be a part of Berkeley AI Research (BAIR).

I'm broadly interested in AI for robotics: data-driven approaches that enable embodied systems to perceive, reason, and make sequential decisions in the real world.

Email  /  GScholar  /  GitHub  /  Twitter

News
  • Fall 2023: After a year at Columbia University, I moved with my lab to Stanford and will continue my PhD here.
  • Summer 2022: Research intern at Meta AI in Pittsburgh, PA.

MD-Splatting: Learning Metric Deformation from 4D Gaussians in Highly Deformable Scenes
Bardienus P. Duisterhof, Zhao Mandi, Yunchao Yao, Jia-Wei Liu, Mike Zheng Shou, Shuran Song, Jeffrey Ichnowski

[arXiv] [Project Website]

We achieve simultaneous dense 3D point tracking and dynamic novel view synthesis on highly deformable objects. Our method, MD-Splatting, builds on recent advances in Gaussian splatting: it learns a deformation function that projects a set of canonical Gaussians into metric space, and it enforces physics-inspired regularization terms based on local rigidity, conservation of momentum, and isometry.

RoCo: Dialectic Multi-Robot Collaboration with Large Language Models
Zhao Mandi, Shreeya Jain, Shuran Song
IEEE International Conference on Robotics and Automation (ICRA), 2024.
[arXiv] [Project Website]

A novel approach to multi-robot collaboration that uses pre-trained large language models (LLMs) for both high-level communication and low-level path planning. Robots are equipped with LLMs to discuss and collectively reason about task strategies. They then generate sub-task plans and task-space waypoint paths, which a multi-arm motion planner uses to accelerate trajectory planning.

CACTI: A Framework for Scalable Multi-Task Multi-Scene Visual Imitation Learning
Zhao Mandi, Homanga Bharadhwaj, Vincent Moens, Shuran Song, Aravind Rajeswaran, Vikash Kumar

[arXiv] [Project Website]

A framework for multi-task, multi-scene robotic manipulation. It scales to many tasks and uses recent advances in text-to-image generative models (e.g., Stable Diffusion) to augment demonstration data with realistic visual variations.

On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning
Zhao Mandi, Pieter Abbeel, Stephen James
Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS), 2022.
[arXiv] [Project Website and Code]

We show that multi-task pretraining with fine-tuning can perform as well as, or better than, meta-pretraining with meta test-time adaptation. We evaluate in a novel setting using vision-based RL benchmarks, including Procgen, RLBench, and Atari, where training is done across distinct tasks and evaluation is done on completely novel tasks.

Towards More Generalizable One-shot Visual Imitation Learning
Zhao Mandi*, Fangchen Liu*, Kimin Lee, Pieter Abbeel
IEEE International Conference on Robotics and Automation (ICRA), 2022.
[arXiv] [Project Website and Code]

We extend one-shot imitation learning to an ambitious multi-task setup, and support this formulation with a 7-task vision-based robotic manipulation benchmark. Our method, MOSAIC, tackles the challenges of multi-task one-shot imitation by improving network architecture and self-supervised representation learning.

DCUR: Data Curriculum for Teaching via Samples with Reinforcement Learning
Daniel Seita, Abhinav Gopal, Zhao Mandi, John Canny
Preprint, in submission, 2021.
[arXiv] [Project Website and Code]

We propose a conceptually simple way to manage a data curriculum to provide samples from a teacher to a student, and show that this facilitates learning in offline and mostly-offline RL.

R-LAtte: Visual Control via Deep Reinforcement Learning with Attention Network
Mandi Zhao, Qiyang Li, Aravind Srinivas, Ignasi Clavera, Kimin Lee, Pieter Abbeel
Deep Reinforcement Learning Workshop at Neural Information Processing Systems (NeurIPS), December 2020. Virtual.

We show that an attention-augmented encoder architecture significantly improves sample efficiency and policy performance in pixel-based deep RL for continuous control.

ZPD Teaching Strategies for Deep Reinforcement Learning from Demonstrations
Daniel Seita, Chen Tang, Roshan Rao, David Chan, Mandi Zhao, John Canny
Deep Reinforcement Learning Workshop at Neural Information Processing Systems (NeurIPS), December 2019. Vancouver, Canada.
[arXiv] [Code]

We investigate whether it makes sense to provide samples that are at a reasonable level of "difficulty" for a learner agent, and empirically test on the standard Atari 2600 benchmark.

Miscellaneous
W&B Report: Navigating Over-parametrized Feature Space with Meta-Gradients
My self-study notes for UC Berkeley CS285: Deep Reinforcement Learning
CS182: A study of overfitting and generalization of RL in ProcGen Game Environments

Website template from Jon Barron.