Asymptotically Optimal Information-Directed Sampling

Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvári

2021

We introduce a simple and efficient algorithm for stochastic linear bandits with finitely many actions that is asymptotically optimal and (nearly) worst-case optimal in finite time. The approach is based on the frequentist information-directed sampling (IDS) framework, with a surrogate for the information gain that is informed by the optimization problem that defines the asymptotic lower bound. Our analysis sheds light on how IDS balances the trade-off between regret and information, and uncovers a surprising connection between the recently proposed primal-dual methods and the IDS algorithm. We demonstrate empirically that IDS is competitive with UCB in finite time and can be significantly better in the asymptotic regime.
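
To illustrate the trade-off between regret and information mentioned above, the following is a minimal sketch of the generic IDS selection step: choose a sampling distribution over actions that minimizes the ratio of squared expected regret to expected information gain. It does not reproduce the paper's surrogate for the information gain. The function name `ids_distribution`, the inputs `gaps` and `info`, and the `grid` parameter are hypothetical and introduced only for illustration. The sketch assumes the gap estimates and information gains are already given, and it relies on the known property of the IDS ratio that a minimizer can be found among distributions supported on at most two actions.

```python
import numpy as np

def ids_distribution(gaps, info, grid=1001):
    """Approximately minimize (mu @ gaps)**2 / (mu @ info) over distributions mu.

    gaps: estimated regret (gap) of each action, nonnegative   [assumed given]
    info: information-gain value of each action, nonnegative   [assumed given]
    Searches all action pairs and a grid of mixing weights.
    Assumes at least one action has strictly positive information gain.
    """
    gaps = np.asarray(gaps, dtype=float)
    info = np.asarray(info, dtype=float)
    k = len(gaps)
    p = np.linspace(0.0, 1.0, grid)  # weight on the first action of a pair
    best_ratio, best_mu = np.inf, None
    for a in range(k):
        for b in range(a, k):
            g = p * gaps[a] + (1 - p) * gaps[b]   # expected gap of the mixture
            i = p * info[a] + (1 - p) * info[b]   # expected information gain
            with np.errstate(divide="ignore", invalid="ignore"):
                ratio = np.where(i > 0, g**2 / i, np.inf)
            j = int(np.argmin(ratio))
            if ratio[j] < best_ratio:
                mu = np.zeros(k)
                mu[a] += p[j]
                mu[b] += 1 - p[j]
                best_ratio, best_mu = ratio[j], mu
    return best_mu, best_ratio
```

In this sketch, an action with a small gap but no information gain is mixed with a more informative, higher-regret action, which is the balance between regret and information that the ratio encodes. The grid search is a simplification chosen for readability; the two-action ratio can also be minimized in closed form.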

Keywords:

Connection (vector bundle)
Asymptotically optimal algorithm
Frequentist inference
Regret
simple
Sampling (statistics)
Mathematical optimization
Upper and lower bounds
Optimization problem
Computer science

