An adaptive dynamic allocation algorithm.
2019
We consider the classical problem of sequential resource allocation where a decision maker must repeatedly divide a budget between several resources, each with diminishing returns. This can be recast as a specific stochastic optimization problem where the objective is to maximize the cumulative reward, or equivalently to minimize the regret. We construct an algorithm that is {\em adaptive} to the complexity of the problem, expressed in term of the regularity of the returns of the resources, measured by the exponent in the {\L}ojasiewicz inequality (or by their universal concavity parameter). Our parameter-independent algorithm recovers the optimal rates for strongly-concave functions and the classical fast rates of multi-armed bandit (for linear reward functions). Moreover, the algorithm improves existing results on stochastic optimization in this regret minimization setting for intermediate cases.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
30
References
0
Citations
NaN
KQI