|
|
 |

GRASP Seminar Series: Fall 2008
September 26th , 11:00 a.m., Wu & Chen Auditorium, Levine Hall
(3330 Walnut Street)
Alexander Rakhlin
University of California, Berkeley
"Online Learning with Limited Feedback"
Abstract: One's ability to learn and make decisions rests heavily on the
availability of feedback. In sequential decision-making problems such
feedback is often limited. A gambler, for example, can observe
entirely the outcome of a horse race regardless of where he placed his
bet; however, when the same gambler chooses his route to travel to the
race track, perhaps at a busy hour, he will likely never learn the
outcome of possible alternatives. The latter limited-feedback problem
is the focus of this talk.
The problem can be phrased as an Online Linear Optimization game with
``bandit'' feedback. The existence of a low-regret algorithm has been
an open question since the work of Awerbuch and Kleinberg in 2004.
A recent reduction to the multi-armed bandit problem allowed
Dani, Hayes, and Kakade to achieve a low-regret guarantee, unfortunately
at the expense of computational efficiency. We present the first known
efficient low-regret algorithm for bandit Online Linear
Optimization over arbitrary convex decision sets. We show how the
difficulties encountered by previous approaches are overcome by
employing regularization. Our solution reveals surprising connections
between online learning and Interior Point methods in Optimization.
We also discuss an emerging phenomenon: regret guarantees for stochastic and
adversarial settings turn out to be the same.
In particular, our method solves the Online Shortest Path problem: at
each round, a path from source to sink is chosen and only the total
length (delay) of this path is revealed. A low-regret algorithm for this
problem has applications in network routing, resource allocation, dynamic
treatment of patients, and more. The worst-case guarantees enjoyed by our
method imply robustness with respect to noise and malicious adversary.
Joint work with Jacob Abernethy and Elad Hazan.
Full Seminar schedule...
|
 |