Towards Practical Lipschitz Stochastic Bandits.

2019 
Stochastic Lipschitz bandit algorithms are methods that govern exploration-exploitation tradeoffs, and have been used for a variety of important task domains, including zeroth order optimization. While beautiful theory has been developed for the stochastic Lipschitz bandit problem, the methods arising from these theories are not practical, and accordingly, the development of practical well-performing bandit algorithms has stalled in recent years. To remedy this, we present a framework for bandit methods that flexibly learns partitions of context- and arm-space. Due to this flexibility, the algorithm is able to efficiently optimize rewards and minimize regret, by focusing on the portions of the space that are most relevant. Our experiments show that (1) using adaptively-learned partitioning, our method can surpass existing stochastic Lipschitz bandit algorithms, and (2) our algorithms can achieve state-of-the-art performance in the challenging optimization of neural network hyperparameter tuning.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    30
    References
    5
    Citations
    NaN
    KQI
    []