The Markovian Price of Information.
2019
Suppose there are $n$ Markov chains and we need to pay a per-step \emph{price} to advance them. The "destination" states of the Markov chains contain rewards; however, we can only collect rewards for a subset of them that satisfies a combinatorial constraint, e.g., the subset has size at most $k$, or it is acyclic in an underlying graph. What strategy should we choose to advance the Markov chains if our goal is to maximize the total reward \emph{minus} the total price that we pay?
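As a rough formalization (not spelled out in the abstract itself, so the notation here is illustrative), one may write the objective as follows, where $\mathcal{F}$ denotes the family of feasible subsets of chains, $X_i$ the reward revealed at chain $i$'s destination, $D$ the set of chains advanced all the way to their destinations, and $\pi_i$ the total per-step price paid to advance chain $i$ under the chosen adaptive strategy:
\[
  \max_{\text{adaptive strategies}} \;\; \mathbb{E}\!\left[\; \max_{\substack{S \in \mathcal{F},\; S \subseteq D}} \sum_{i \in S} X_i \;-\; \sum_{i=1}^{n} \pi_i \right].
\]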
In this paper we introduce a Markovian price of information model to capture settings such as the above, where the input parameters of a combinatorial optimization problem are revealed via Markov chains. We design optimal/approximation algorithms that jointly optimize the value of the combinatorial problem and the total paid price. We also study the \emph{robustness} of our algorithms to perturbations of the distribution parameters, and show how to handle the \emph{commitment} constraint.
Our work brings together two classical lines of investigation: getting optimal strategies for Markovian multi-armed bandits, and getting exact and approximation algorithms for discrete optimization problems using combinatorial as well as linear-programming relaxation ideas.