GRASP Lab Seminar 2003-2004

February 20, 11:00 AM, Levine Hall 307, hosted by Vijay Kumar.

Greg Grudic
University of Colorado

Using Probabilistic Regression and Classification Models for End-to-End Learning of Robotic Tasks

Abstract: A perpetual question in mobile robot control is how to describe tasks and programs in terms of behaviors or modes. We address this problem by having a human teleoperator demonstrate a task, and applying a Supervised Learning approach that identifies modes. This is achieved by determining regions of the state space where actuator outputs are robustly predicted by compact nonlinear mappings of sensory inputs. Thus, we impose no human intuitions (anthropomorphic bias) on what a robot control mode should be, but use nonlinear statistical techniques to discover them from direct task demonstration.

Within this context, in order to learn what a mode is, we need regression and classification models that give point-specific estimates of how good a prediction is. We present such a framework, where classifiers yield C(X) and P(C(X) = True Class | X) for instance X, and regression models yield R(X) and Pr(y_1 < y_true < y_2 | X) for instance X and user-specified y_1 and y_2. Our approach makes minimal distributional assumptions: no specific distribution (e.g., Gaussian) is assumed. Experimental results demonstrate that our framework outperforms commonly used algorithms in probability estimates, while still maintaining state-of-the-art regression and classification accuracy.
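To make the two kinds of output concrete, here is a minimal Python sketch of a predictor that returns both a prediction and a point-specific probability without assuming any parametric distribution. It uses plain k-nearest-neighbor counting on 1-D inputs, which is a stand-in chosen for illustration and not the framework presented in the talk; the function names and the parameter k are assumptions.

```python
from bisect import bisect_left, bisect_right
from collections import Counter


def nearest(train, x, k):
    """Return the k training pairs (input, target) whose 1-D input is closest to x."""
    return sorted(train, key=lambda pair: abs(pair[0] - x))[:k]


def classify(train, x, k):
    """Return C(x) and an empirical estimate of P(C(x) = true class | x).

    The estimate is the fraction of neighbors that share the majority label.
    """
    neighbors = nearest(train, x, k)
    votes = Counter(label for _, label in neighbors)
    label, count = votes.most_common(1)[0]
    return label, count / len(neighbors)


def regress(train, x, k, y1, y2):
    """Return R(x) and an empirical estimate of Pr(y1 < y_true < y2 | x).

    R(x) is the mean neighbor target; the probability is the fraction of
    neighbor targets strictly inside (y1, y2), so no distribution is assumed.
    """
    targets = sorted(y for _, y in nearest(train, x, k))
    lo = bisect_right(targets, y1)  # index of first target > y1
    hi = bisect_left(targets, y2)   # index of first target >= y2
    return sum(targets) / len(targets), max(hi - lo, 0) / len(targets)


if __name__ == "__main__":
    labeled = [(0.0, "a"), (0.1, "a"), (0.2, "b"), (0.3, "a"), (5.0, "b")]
    valued = [(0.0, 1.0), (0.1, 1.2), (0.2, 0.9), (0.3, 1.1), (5.0, 10.0)]
    print(classify(labeled, 0.15, k=4))
    print(regress(valued, 0.15, k=4, y1=0.95, y2=1.15))
```

Counting neighbors gives coarse probabilities (multiples of 1/k), so this only illustrates the interface: every prediction comes paired with its own confidence estimate.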

Finally, I will describe our cyclic two-stage process for learning sensor-to-actuator mappings for end-to-end robot task learning. The first stage, called the Supervised Learning stage, involves a human operator demonstrating the task via teleoperation, thereby generating a sequence of sensor-input to actuator-output pairs. These input-output pairs are sent to a supervised learning algorithm, which builds a set of modes approximating the human's control strategy. This learned mode-switching controller is passed to the second stage, the Reinforcement Learning stage, which autonomously updates the mode boundaries to improve performance based on environment feedback. The learning process cycles seamlessly through these two stages, continually under human supervision. If the process is successful, the desired robot task is eventually learned by the robot and no further human intervention is needed. We consider the task learned when the success rate of the autonomous controller is similar to the performance of a human operator.
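The cycle described above can be sketched on a toy problem. The Python below is an illustrative assumption throughout, not the speaker's algorithm: the "robot" has a single 1-D sensor and two modes separated by one boundary, the supervised stage fits that boundary from demonstrations, and a simple reward-driven local search stands in for the Reinforcement Learning stage. TARGET_BOUNDARY, the step sizes, and the tolerance are all hypothetical.

```python
TARGET_BOUNDARY = 0.6  # toy assumption: boundary the operator uses and the environment rewards


def act(boundary, state):
    """Two-mode controller: mode 0 at or below the boundary, mode 1 above it."""
    return 1 if state > boundary else 0


def success_rate(boundary, states):
    """Environment feedback: fraction of states where the controller acts correctly."""
    hits = sum(act(boundary, s) == act(TARGET_BOUNDARY, s) for s in states)
    return hits / len(states)


def supervised_stage(demos):
    """Fit the mode boundary midway between the two demonstrated modes."""
    low = max(s for s, a in demos if a == 0)
    high = min(s for s, a in demos if a == 1)
    return (low + high) / 2


def reinforcement_stage(boundary, states, step=0.02, rounds=2):
    """Nudge the boundary using success feedback only (stand-in for RL)."""
    best = success_rate(boundary, states)
    for _ in range(rounds):
        for candidate in (boundary - step, boundary + step):
            score = success_rate(candidate, states)
            if score > best:
                boundary, best = candidate, score
        step /= 2
    return boundary


def learn_task(demo_batches, eval_states, tolerance=0.02):
    """Alternate the two stages until performance is close to the operator's."""
    human_rate = success_rate(TARGET_BOUNDARY, eval_states)
    demos, boundary, cycles = [], None, 0
    for batch in demo_batches:
        cycles += 1
        # Stage 1: the operator demonstrates, and the boundary is refit.
        demos += [(s, act(TARGET_BOUNDARY, s)) for s in batch]
        boundary = supervised_stage(demos)
        # Stage 2: autonomous refinement from environment feedback.
        boundary = reinforcement_stage(boundary, eval_states)
        if success_rate(boundary, eval_states) >= human_rate - tolerance:
            break  # task considered learned; no further demonstrations needed
    return boundary, success_rate(boundary, eval_states), cycles


if __name__ == "__main__":
    states = [i / 100 for i in range(101)]
    print(learn_task([[0.1, 0.9], [0.5, 0.7]], states))
```

In this toy run the first, sparse batch of demonstrations leaves the boundary too far off for the small refinement steps to recover, so a second round of demonstrations is needed before the stopping criterion is met.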

Biography: Greg Grudic received his B.A.Sc. degree in Engineering Physics, his M.A.Sc. degree in Electrical Engineering, and his Ph.D. degree (1997) in Electrical and Computer Engineering, all from the University of British Columbia, Canada. From 1998 to 2001 he was a Post-Doctoral Fellow at the IRCS and the GRASP Lab, at the University of Pennsylvania. In August 2001 he joined the Department of Computer Science at the University of Colorado at Boulder, where he is now an Assistant Professor.
