Designing Truthful Contextual Multi-Armed Bandits based Sponsored Search Auctions
2020
We consider the Contextual Multi-Armed Bandit (ConMAB) problem for sponsored search auction (SSA) in the presence of strategic agents. The problem has two main dimensions: i) Need to learn unknown click-through rates (CTR) for each agent and context combination and ii) Elicit true bids from the agents. Thus, we address the problem to design non-exploration-separated truthful MAB mechanism in the presence of contexts (aka side information). Towards this, we first design an elimination-based ex-post monotone algorithm ELinUCB-SB, thus leading to an ex-post incentive compatible mechanism. M-ELinUCB-SB outperforms the existing mechanisms available in the literature; however, theoretically, the mechanism may incur linear regret in some instances. We next design SupLinUCB -based allocation rule SupLinUCB-S which obtains a worst-case regret of O(n2/sup> √dT log T) as against O(n √dT log T) for non-strategic settings; O(n) is price of truthfulness. We demonstrate the efficacy of both of our mechanisms via simulation and establish superior performance over the existing literature.
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
11
References
0
Citations
NaN
KQI