Designing Truthful Contextual Multi-Armed Bandits based Sponsored Search Auctions

2020 
We consider the Contextual Multi-Armed Bandit (ConMAB) problem for sponsored search auction (SSA) in the presence of strategic agents. The problem has two main dimensions: i) Need to learn unknown click-through rates (CTR) for each agent and context combination and ii) Elicit true bids from the agents. Thus, we address the problem to design non-exploration-separated truthful MAB mechanism in the presence of contexts (aka side information). Towards this, we first design an elimination-based ex-post monotone algorithm ELinUCB-SB, thus leading to an ex-post incentive compatible mechanism. M-ELinUCB-SB outperforms the existing mechanisms available in the literature; however, theoretically, the mechanism may incur linear regret in some instances. We next design SupLinUCB -based allocation rule SupLinUCB-S which obtains a worst-case regret of O(n2/sup> √dT log T) as against O(n √dT log T) for non-strategic settings; O(n) is price of truthfulness. We demonstrate the efficacy of both of our mechanisms via simulation and establish superior performance over the existing literature.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    11
    References
    0
    Citations
    NaN
    KQI
    []