
GRASP Lab Seminar 2004-2005

November 12, 11:00 AM, Levine Hall 307.

Yann LeCun
Courant Institute of Mathematical Sciences, NYU

End-to-End Training of Energy-Based Models with Applications to Vision and Robotics

Abstract: Probabilistic graphical models associate a (normalized) probability with each configuration of the variables to be predicted. Ensuring proper normalization justifies many popular learning techniques (e.g. maximum likelihood), but sometimes leads to computational intractability. By contrast, energy-based models (EBMs) merely associate an energy with each configuration of the variables, eliminating the need for proper normalization. One can view the energy function as a measure of "compatibility" between the values of the observed variables, the variables to be predicted, and the latent variables. Performing inference consists of comparing the energies associated with various configurations of the non-observed variables and choosing the configuration with the smallest energy.

Using EBMs circumvents the requirement to normalize the models and compute partition functions that may be intractable, but it requires the use of loss functions that appropriately "carve" the energy landscape so as to place minima at the desired locations. We present a large family of suitable loss functions, within which traditional loss functions, such as negative log-likelihood, form a small subset. We show how the EBM framework can be used to build large-scale vision systems that can be trained end-to-end, from raw pixel images to ultimate outputs.

We will describe, show videos of, and run real-time live demos of various vision applications of end-to-end EBM training. These will include:

- A real-time system for simultaneously detecting human faces in images and estimating their pose.
- A mobile robot that was trained to emulate a human driver so as to avoid obstacles in natural environments, solely from stereo image pairs.
- A real-time system for detecting and recognizing generic objects such as vehicles, people, airplanes, and animals, with full invariance to pose, illumination, and clutter.
Parts of this work are joint with Fu Jie Huang (NYU), Leon Bottou (NEC), Rita Osadchy (NEC), and Matt Miller (NEC).
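
As a rough illustration of the ideas in the abstract, the following minimal sketch shows inference as energy minimization over a small discrete set of candidate outputs, together with two loss functions: negative log-likelihood written in energy form, and a margin loss as one other possible way of shaping the energy landscape. The quadratic energy, the parameter `w`, and all function names are hypothetical choices made for this example; they are not the models or losses presented in the talk.

```python
import math

def energy(x, y, w):
    """Toy energy (assumed form): low when the linear score w*x is close to y."""
    return (w * x - y) ** 2

def infer(x, candidates, w):
    """Inference: pick the candidate output with the smallest energy."""
    return min(candidates, key=lambda y: energy(x, y, w))

def nll_loss(x, y_true, candidates, w, beta=1.0):
    """Negative log-likelihood in energy form:
    E(x, y_true) + (1/beta) * log(sum over y of exp(-beta * E(x, y)))."""
    energies = [energy(x, y, w) for y in candidates]
    m = min(energies)  # shift by the minimum energy for numerical stability
    log_z = -beta * m + math.log(sum(math.exp(-beta * (e - m)) for e in energies))
    return energy(x, y_true, w) + log_z / beta

def hinge_loss(x, y_true, candidates, w, margin=1.0):
    """Margin loss: zero once the correct energy is below the lowest
    incorrect energy by at least `margin`; needs no partition function."""
    e_true = energy(x, y_true, w)
    e_wrong = min(energy(x, y, w) for y in candidates if y != y_true)
    return max(0.0, margin + e_true - e_wrong)

if __name__ == "__main__":
    candidates = [0, 1, 2]
    print(infer(1.2, candidates, w=1.0))
    print(nll_loss(1.2, 1, candidates, w=1.0))
    print(hinge_loss(1.2, 1, candidates, w=1.0))
```

In this sketch the negative log-likelihood has to sum over every candidate output (the partition function), whereas the margin loss only compares the correct answer with the lowest-energy incorrect one.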

Biography: Yann LeCun is a Professor of Computer Science with the Courant Institute of Mathematical Sciences at NYU. He received an Engineer Diploma from ESIEE (Paris) in 1983 and a Ph.D. in Computer Science from Universite Curie (Paris) in 1987. After a postdoctoral fellowship at the University of Toronto, Dr. LeCun joined the Adaptive Systems Research Department at AT&T Bell Laboratories in 1988. Following the AT&T/Lucent spin-off in 1996, he joined AT&T Labs-Research as head of the Image Processing Research Department. In 2002, he became a Fellow at the NEC Research Institute in Princeton. He joined the NYU faculty in 2003. Dr. LeCun's research interests include computational and biological models of learning and perception, computer vision, robotics, information theory, data compression, digital libraries, and the physical basis of computation. Some of the methods and technologies he developed are in wide commercial use for pattern recognition applications, data mining systems, and digital libraries. His handwriting recognition systems are used by many banks to automate check processing, and his DjVu document image compression system is used by hundreds of online digital libraries around the world.
