Abstract: Vision systems must understand a visual world that is spectacularly intricate. A typical scene contains hundreds of surfaces that scatter light in distinct and complex ways, and these surfaces interact by occluding one another, casting shadows, and mutually reflecting light. For vision systems to succeed, they must identify the structure within this complexity and exploit it as fully as possible.
This requires forward models of the imaging process, and visual inference techniques that rely on such models are often referred to as ‘physics-based’. Traditionally, physics-based techniques have achieved tractability by making rather severe assumptions about the world (diffuse reflectance, point-source illumination, etc.). While these techniques possess elegant formulations, they can be hard to apply in natural environments.
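To make those classical assumptions concrete, the following is a minimal sketch (not taken from the talk) of the forward model they imply: Lambertian (diffuse) reflectance lit by a single distant point source, where image intensity is the albedo times the clamped cosine between the surface normal and the light direction. All function and variable names here are illustrative.

```python
import numpy as np

def lambertian_forward(normals, albedo, light_dir):
    """Render image intensities under the classical assumptions.

    normals:   (H, W, 3) array of unit surface normals
    albedo:    (H, W) array of diffuse albedo values
    light_dir: (3,) unit vector pointing toward the light source
    """
    # I = albedo * max(0, n . l): diffuse shading with attached shadows.
    shading = np.einsum('hwc,c->hw', normals, light_dir)
    return albedo * np.clip(shading, 0.0, None)

# Example: the visible hemisphere of a unit sphere, lit from the upper left.
H = W = 128
ys, xs = np.mgrid[-1:1:H * 1j, -1:1:W * 1j]
mask = xs**2 + ys**2 < 1.0
zs = np.sqrt(np.clip(1.0 - xs**2 - ys**2, 0.0, None))
normals = np.stack([xs, ys, zs], axis=-1) * mask[..., None]
light = np.array([-1.0, -1.0, 1.0])
light /= np.linalg.norm(light)
image = lambertian_forward(normals, np.ones((H, W)), light)
```

Inverting this model is the classical shape-from-shading problem: recover the normals (and hence the shape) from the observed intensities. The model's simplicity is what makes that inversion tractable, and its omissions (gloss, interreflection, extended illumination) are what limit it in natural environments.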
Over the past few years, our group has been working to relax these restrictions by developing models that strike a finer balance between tractability and accuracy. We seek models that are complex enough to accurately describe the natural world, yet simple enough to be ‘inverted’ for inference. The basic goal is to provide a foundation for vision systems that are more likely to succeed in the real world.
In this talk I will summarize our results and describe in detail two recent studies that recover shape and reflectance information under ‘natural’ conditions.