Stochastic Multi-armed Bandits with Arm-specific Fairness Guarantees

Vishakha Patil, Ganesh Ghalme, Vineet Nair, Yadati Narahari

2019

We study a variant of the stochastic multi-armed bandit problem in which each arm must be pulled for at least a given fraction of the total available rounds. We investigate the interplay between learning and fairness, where fairness is captured by a pre-specified vector of guaranteed pull fractions. We define a fairness-aware regret that accounts for these fairness constraints and naturally extends the conventional notion of regret. We show that logarithmic regret can be achieved while (almost) satisfying the fairness requirements. In contrast to the current literature, where the fairness notion is instance dependent, we take the fairness criterion to be exogenously specified as an input to the algorithm. Our regret guarantee is universal, i.e., it holds for any given fairness vector.
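
To make the fairness constraint concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes Bernoulli rewards, a standard UCB1 index as the learning rule, and a hypothetical tolerance parameter `alpha` that bounds how far an arm may fall behind its guaranteed share before it is pulled unconditionally. The function name `fair_ucb` and all of its parameters are assumptions made for illustration.

```python
import math
import random


def fair_ucb(means, fractions, alpha, horizon, seed=0):
    """Simulate a fairness-constrained UCB1 on Bernoulli arms.

    means:     true success probabilities (used only to simulate rewards)
    fractions: guaranteed pull fraction per arm (assumed to sum to less than 1)
    alpha:     hypothetical tolerance on the fairness deficit
    Returns the number of pulls of each arm after `horizon` rounds.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k

    def ucb_index(i, t):
        # Unpulled arms get an infinite index so each arm is tried once.
        if counts[i] == 0:
            return math.inf
        return sums[i] / counts[i] + math.sqrt(2.0 * math.log(t) / counts[i])

    for t in range(horizon):  # t = number of rounds already played
        # Deficit: pulls owed to the arm so far minus pulls it has received.
        deficits = [fractions[i] * t - counts[i] for i in range(k)]
        most_behind = max(range(k), key=lambda i: deficits[i])

        if deficits[most_behind] > alpha:
            arm = most_behind  # fairness step: serve the most deprived arm
        else:
            arm = max(range(k), key=lambda i: ucb_index(i, t))  # learning step

        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    return counts


if __name__ == "__main__":
    pulls = fair_ucb([0.9, 0.5, 0.3], [0.2, 0.2, 0.2], alpha=1.0, horizon=2000)
    print(pulls)
```

In this sketch the fairness vector is an input, independent of the reward means, which mirrors the exogenous fairness criterion described in the abstract. Rounds not needed to meet the guarantees are left to the learning rule, so they concentrate on the empirically best arm.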

Keywords:

Mathematical optimization
Logarithm
Mathematics
Regret
