This will be a hybrid event with in-person attendance in Wu and Chen and virtual attendance on Zoom.
Recent advances in computer vision have led to the rise of highly expressive 3D scene models such as Neural Radiance Fields (NeRFs) and Gaussian Splats (GSplats). Beyond rendering lifelike images, these models allow robots to ground visual, semantic, physical, and affordance properties in a common 3D representation, to rearrange objects in the scene, and even to simulate physical interactions. In this talk, I will describe our efforts to build new robot autonomy capabilities around these models while preserving safety, modularity, and interpretability. I will present navigation algorithms that let robots safely maneuver through their environment using NeRFs and GSplats, even while training the model online in a SLAM-like fashion. I will describe methods to embed semantic and affordance information into radiance fields, giving robots a 3D grounding for understanding and executing tasks from natural language commands. Finally, I will describe the use of these neural models as high-fidelity training environments for learning end-to-end visuomotor policies, and I will demonstrate such a policy navigating a drone through an obstacle-rich environment while remaining robust to significant visual distractors. I will conclude with future opportunities and challenges for neural environment models in robotics.