This was a hybrid event with in-person attendance in Towne 337 and virtual attendance…
Generalist robot policies, trained on large and diverse robot datasets, have the potential to transform how robot learning research is done: in the same way that current models in NLP are almost universally derived from pretrained large language models, future robot policies might be initialized from generalist robot models and finetuned with only modest amounts of target domain data.
In this talk, I will discuss our efforts to build such generalist robot policies, focusing on two key ingredients: data and models. On the data side, I will discuss our recent work on building the largest open-source real robot manipulation datasets to date, the Open X-Embodiment dataset and DROID, with a total of 2M+ robot trajectories. On the model side, I will summarize what we learned from building RT-X and Octo, the first generalist robot policies trained on the Open X-Embodiment dataset. I will also discuss their current limitations and outline important steps for future research towards ubiquitous robot foundation models.