Continuous Assortment Optimization with Logit Choice Probabilities under Incomplete Information.

2018 
We consider assortment optimization for a product with an attribute that can be adjusted continuously. Examples include the duration of a loan, the data limit of a cell phone subscription, and the greenness of paint. We represent the collection of all product variants as the unit interval and ask which subset of products a retailer should offer to customers in order to maximize profit. We model customer choice behavior by a continuous extension of the multinomial logit model and allow for a capacity constraint on the offered assortment. We study this problem under incomplete information, which constitutes an instance of a continuous combinatorial multi-armed bandit problem. The unknown quantities in the model are estimated by kernel density estimation with Legendre kernels and bounded support, for which we derive new convergence rates. We present an explore-then-exploit policy and show that it incurs regret of order $T^{2/3}$ (neglecting logarithmic factors). Moreover, by showing that any policy must incur regret of at least order $T^{2/3}$ in the worst case, we conclude that our policy is asymptotically optimal.
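As a rough illustration of the objective, the sketch below evaluates the expected profit per customer of an interval assortment $[a, b] \subseteq [0, 1]$ under one plausible form of a continuous logit model. The abstract does not state the model formula, so everything here is an assumption: the preference function `v`, the marginal profit function `w`, the no-purchase weight normalized to 1, and the helper name `expected_profit` are all hypothetical.

```python
def expected_profit(v, w, a, b, n=10_000):
    """Expected profit per customer when offering the interval [a, b].

    Assumed model (not taken from the abstract): a customer picks product x
    in the assortment with density v(x) / (1 + integral of v over [a, b]),
    and buys nothing with the remaining probability. Integrals are
    approximated with the midpoint rule on n cells.
    """
    h = (b - a) / n
    numerator = 0.0    # approximates the integral of w(x) * v(x) over [a, b]
    denominator = 1.0  # no-purchase weight (assumed to be 1) plus integral of v
    for i in range(n):
        x = a + (i + 0.5) * h
        numerator += w(x) * v(x) * h
        denominator += v(x) * h
    return numerator / denominator


# Illustrative choices only: uniform preference, profit increasing in x.
v = lambda x: 1.0
w = lambda x: x

full = expected_profit(v, w, 0.0, 1.0)     # offer every variant
subset = expected_profit(v, w, 0.4, 1.0)   # drop the least profitable variants
```

With these illustrative functions, the subset $[0.4, 1]$ yields a higher expected profit than the full interval (0.2625 versus 0.25), which shows why choosing which products to offer is a nontrivial optimization even before any capacity constraint is imposed.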