Gambler Bandits and the Regret of Being Ruined (accepted paper)

2021 
In this paper we consider a class of problems called multi-armed gambler bandits (MAGB), a modified version of the Bernoulli multi-armed bandit (MAB) problem in which two new elements must be taken into account: the budget and the risk of ruin. The agent starts with an initial budget that evolves over time according to the received rewards, which are either +1 after a success or −1 after a failure. The problem can also be seen as a MAB version of the classic gambler's ruin game. The contribution of this paper is twofold: a preliminary analysis of the probability of being ruined given the current budget and observations, and an alternative regret formulation that combines the classic notion of regret with the expected loss due to the probability of being ruined. Finally, standard state-of-the-art methods are compared experimentally using the proposed metric.
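The budget dynamics described above can be illustrated with a minimal sketch of the single-arm special case, which is the classic gambler's ruin game. It assumes a known success probability `p`, so it does not reproduce the paper's analysis, which conditions on observations over several arms. The function names `ruin_probability` and `simulate_ruin` are hypothetical and chosen only for illustration. The closed form used is the standard infinite-horizon result for a ±1 random walk absorbed at zero: ruin is certain when p ≤ 1/2, and has probability ((1 − p)/p)^b for initial budget b otherwise.

```python
import random


def ruin_probability(p: float, budget: int) -> float:
    """Probability of ever reaching budget 0 when playing one Bernoulli(p) arm
    forever, with reward +1 on success and -1 on failure (classic gambler's ruin)."""
    if budget <= 0:
        return 1.0  # already ruined
    if p <= 0.5:
        return 1.0  # no positive drift: ruin is certain in the long run
    return ((1.0 - p) / p) ** budget


def simulate_ruin(p: float, budget: int, horizon: int, runs: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the probability of ruin within `horizon` rounds."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(runs):
        b = budget
        for _ in range(horizon):
            b += 1 if rng.random() < p else -1  # budget follows the rewards
            if b == 0:  # ruin: the agent can no longer play
                ruined += 1
                break
    return ruined / runs


if __name__ == "__main__":
    p, b = 0.75, 2
    print("closed form:", ruin_probability(p, b))
    print("simulated:  ", simulate_ruin(p, b, horizon=200, runs=5000, seed=1))
```

With a finite horizon the simulated value only approximates the closed form from below. The gap is small when the arm has a clear positive drift, and large when p is close to or below 1/2, where ruin is certain only in the limit.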