COMPUTER SCIENCE TECHNICAL REPORT ABSTRACTS

CMU-CS-07-102
Computer Science Department
School of Computer Science, Carnegie Mellon University

Approximation Algorithms Going Online

Sham Kakade*, Adam Tauman Kalai**, Katrina Ligett

January 2007

Keywords: Approximation algorithms, regret minimization, online linear optimization

In an online linear optimization problem, in each period t, an online algorithm chooses s_t ∈ S from a fixed (possibly infinite) set S of feasible decisions. Nature (who may be adversarial) chooses a weight vector w_t ∈ Rⁿ, and the algorithm incurs cost c(s_t,w_t), where c is a fixed cost function that is linear in the weight vector. In the full-information setting, the vector w_t is then revealed to the algorithm, and in the bandit setting, only the cost experienced, c(s_t,w_t), is revealed. The goal of the online algorithm is to perform nearly as well as the best fixed s ∈ S in hindsight. Many repeated decision-making problems with weights fit naturally into this framework, such as online shortest-path, online TSP, online clustering, and online weighted set cover.
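As an illustration of the benchmark described above, the following is a minimal sketch of how regret against the best fixed decision in hindsight could be computed. It assumes a finite decision set whose elements are represented as vectors and a cost of the form c(s,w) = <s,w>; the names cost, regret, S, weights, and played are hypothetical and do not come from the report.

```python
from typing import Sequence

Vector = Sequence[float]


def cost(s: Vector, w: Vector) -> float:
    """Assumed cost c(s, w) = <s, w>, which is linear in the weight vector w."""
    return sum(si * wi for si, wi in zip(s, w))


def regret(feasible: Sequence[Vector],
           played: Sequence[Vector],
           weights: Sequence[Vector]) -> float:
    """Total cost of the played decisions minus the total cost of the
    best single decision in `feasible`, chosen with hindsight."""
    incurred = sum(cost(s, w) for s, w in zip(played, weights))
    best_fixed = min(sum(cost(s, w) for w in weights) for s in feasible)
    return incurred - best_fixed


# Toy instance (illustrative data only): two decisions, three periods.
S = [(1.0, 0.0), (0.0, 1.0)]
weights = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
played = [S[0], S[1], S[0]]  # pays cost 1 in every period
example_regret = regret(S, played, weights)
```

In this toy instance the played sequence incurs total cost 3, while always choosing the second decision would incur total cost 1, so the regret is 2. An online algorithm in the sense above aims to keep this quantity small relative to the number of periods.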

Previously, it was shown how to convert any efficient exact offline optimization algorithm for such a problem into an efficient online algorithm in both the full-information and the bandit settings, with average cost nearly as good as that of the best fixed s ∈ S in hindsight. However, in the case where the offline algorithm is an approximation algorithm with ratio α > 1, the previous approach only worked for special types of approximation algorithms.

We show how to convert any efficient offline α-approximation algorithm for a linear optimization problem into an efficient algorithm for the corresponding online problem, with average cost not much larger than α times that of the best s ∈ S, in both the full-information and the bandit settings. Our main innovation is in the full-information setting: we combine Zinkevich's algorithm for convex optimization with a geometric transformation that can be applied to any approximation algorithm. In the bandit setting, standard techniques apply, except that a "Barycentric Spanner" for the problem is also (provably) necessary as input.
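For readers unfamiliar with Zinkevich's algorithm, the following is a minimal sketch of projected online gradient descent for linear costs. It is not the report's method: the geometric transformation for approximation algorithms is not reproduced here. The sketch assumes the feasible set is the Euclidean unit ball (so that projection has a closed form), a cost of the form <x,w>, and a step size of 1/sqrt(t); the function names are hypothetical.

```python
import math
from typing import List, Sequence


def project_to_unit_ball(x: Sequence[float]) -> List[float]:
    """Euclidean projection onto the unit ball (assumed feasible set)."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm <= 1.0:
        return list(x)
    return [xi / norm for xi in x]


def online_gradient_descent(weights: Sequence[Sequence[float]],
                            dim: int) -> List[List[float]]:
    """Return the point played in each period.

    For a linear cost <x, w_t> the gradient with respect to x is w_t,
    so each update steps against w_t and projects back onto the set.
    """
    x = [0.0] * dim
    plays = []
    for t, w in enumerate(weights, start=1):
        plays.append(list(x))          # commit to x before w_t is revealed
        eta = 1.0 / math.sqrt(t)       # decreasing step size (assumed schedule)
        stepped = [xi - eta * wi for xi, wi in zip(x, w)]
        x = project_to_unit_ball(stepped)
    return plays


# Toy run (illustrative data only): the same weight vector every period.
plays = online_gradient_descent([(1.0, 0.0)] * 4, dim=2)
```

In this toy run the iterate moves from the origin to the boundary point opposite the repeated weight vector and stays there, which is the best fixed point in the unit ball for that sequence.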

Our algorithm can also be viewed as a method for playing large repeated games, where one can only compute approximate best-responses, rather than best-responses.

19 pages

*Toyota Technological Institute at Chicago
**Georgia Institute of Technology
