This event was held virtually via Zoom…
Over the past decade, we have seen remarkable progress in Computer Vision, fueled largely by recent advances in Deep Learning. Unsurprisingly, human perception has been a center of attention. We now have access to systems that work remarkably well for traditional 2D tasks like segmentation or pose estimation. However, scaling these to 3D remains particularly challenging because of inherent ambiguities and the scarcity of annotations.
In this talk, I will focus on three important problems in reconstructing 3D bodies from images and how my research attempts to solve them. First, I will discuss the limited availability of annotated data and propose a method for addressing it. Next, I will present my work on modeling the ambiguities in 3D human reconstruction and demonstrate its usefulness for a variety of downstream tasks. Finally, I will move beyond single-person 3D pose estimation and show how we can scale these methods to scenes with multiple humans.