Challenges in online updating of individual choice models for recommender systems or autonomous decision agents

2017 
Significant choice modelling work in marketing and transport research has been carried out to estimate individual-level choice models (ILMs). The extent of such work is, however, surprisingly limited when contrasted with the increasing importance of highly personalised products and services across a wide range of markets. In both the transport and marketing sectors, ILMs have been used to capture heterogeneity across decision makers. This individual information can in turn be used, for example, to simulate individual-level choices. In the transport micro-simulation context, however, the results that matter are in most cases still at the aggregate level and are obtained by aggregating the simulated choices of individual agents. On the other hand, as Dumont et al. [1] argue, transport policy impacts on various population segments are also of interest; the disaggregate travel behaviour of individual agents from micro-simulations can therefore be aggregated to varying degrees to obtain information on policy effects for the population segments of interest. It is in marketing, however, that the appeal of disaggregate consumer information is far more obvious, e.g. to better target products or campaigns to specific groups. And because individuals can now be reached as easily as groups, marketing has shifted from targeting groups to targeting individuals. With the digitalisation of consumer markets (e.g. online shopping, music, film and TV series streaming services), individual recommendation based on preference learning mechanisms has skyrocketed. Personalised services are also expanding far beyond Internet consumer markets across various sectors, including transport (e.g. personalised travel information systems). Individual-level preference models are therefore becoming an integral part of the service design and revenue management strategies of service providers. Moreover, recommender systems based on individual preferences are likely to gradually mutate into autonomous robotic avatars making decisions on behalf of individuals, and such decisions should reflect the idiosyncratic preferences of those individuals.

Against this background, the contribution of choice modellers to the development of preference learning mechanisms for recommender systems has so far been marginal (a rare example is [2]); research and development of preference learning systems for this kind of application has mainly been left to computer scientists. In this paper we analyse and discuss how current individual-level choice model estimation techniques, devised by choice modellers for other purposes, can contribute to online learning of individual preferences for recommender system or autonomous agent applications. We use simulation to explore how ILMs can be efficiently updated online (via Bayesian updating) as new observations from a single individual become available, without re-estimating a full sample model. The base ILMs are generated using three different techniques: conditional parameter estimates from sample-level mixed multinomial logit (MMNL) models [3], the by-product of Bayesian estimation of MMNLs [3,4], and the hierarchical Bayes (HB) procedure introduced by Dumont et al. [1]. We then assess how well these individual-level models perform under updating.
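To make the updating step concrete, the following is a minimal sketch, in Python, of one way such an online Bayesian update can work without re-estimating the sample model: the individual's posterior over taste parameters is represented by weighted draws from the sample-level mixing distribution, and each newly observed choice reweights the draws by its multinomial logit likelihood. The population moments, attribute values and simulated choice stream are illustrative assumptions, not values from the paper.

```python
# A minimal sketch (not the paper's exact procedure) of online Bayesian
# updating of an individual's taste parameters with an MNL kernel. The
# posterior is a set of weighted draws from the sample-level mixing
# distribution; each observed choice reweights the draws by its likelihood,
# so the full sample model is never re-estimated.
import numpy as np

rng = np.random.default_rng(0)

# Assumed sample-level mixing distribution (e.g. from an estimated MMNL):
MU = np.array([-1.0, 0.8])           # population means (cost, quality tastes)
SIGMA = np.diag([0.5**2, 0.3**2])    # population covariance

N_DRAWS = 5000
betas = rng.multivariate_normal(MU, SIGMA, size=N_DRAWS)  # (N_DRAWS, 2)
weights = np.full(N_DRAWS, 1.0 / N_DRAWS)                 # uniform prior weights

def mnl_choice_prob(beta, X, chosen):
    """MNL probability of the chosen alternative; X is (n_alt, n_attr)."""
    v = X @ beta
    v -= v.max()                      # numerical stability
    p = np.exp(v) / np.exp(v).sum()
    return p[chosen]

def update(betas, weights, X, chosen):
    """One Bayesian update: reweight draws by the new choice's likelihood."""
    like = np.array([mnl_choice_prob(b, X, chosen) for b in betas])
    w = weights * like
    return w / w.sum()

# Simulate a stream of observations from one individual and update online.
X_t = np.array([[2.0, 1.0],           # alternative 0: cost, quality
                [1.0, 0.2]])          # alternative 1
for t in range(20):
    chosen = 0 if rng.random() < 0.6 else 1   # stand-in for the user's choice
    weights = update(betas, weights, X_t, chosen)

beta_hat = weights @ betas            # posterior mean of individual tastes
print("posterior mean tastes:", beta_hat)
```

The posterior mean here plays the role of the individual-level estimate; conditional estimates from MMNL (as in [3]) take essentially this reweighting form, while the HB procedure of [1] would replace the fixed draw set with a sampler.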
Specifically, with our analyses we intend to address the two challenges detailed below.

First, Bayesian estimation or updating of preference parameters rests on the hypothesis of stable preferences. In reality, however, preferences often drift. We test how responsive the Bayesian updating approach is in capturing preference drift, and we propose a procedure to accelerate it based on artificially increasing the prior variance. This procedure is inspired by [1], where controlling the prior variance is a means to shift the emphasis between individual choice behaviour and the sample-level model used as prior in the HB estimation of their ILMs.

The second aspect we investigate is closely connected to the first. Preference learning systems should be robust to outliers, and a preference updating system designed to capture preference drift should be able to distinguish whether choices outside the recommended set represent a preference change or outlier behaviour resulting, e.g., from an observable (extreme) perturbation of the choice context. The system's user could reveal the nature of the outlier after being prompted by the system itself to classify an observation outside the recommendation set. Alternatively, to avoid the user's intervention, the mechanism capturing preference drift could be implemented such that a preference change is triggered only after a threshold number of clustered observations outside the recommended set is reached. A further alternative is a rewarding scheme similar to that used in reinforcement learning: when a change in preference parameters leads to recommended sets that include the preferred alternative, the learning system is rewarded, otherwise it is penalised, and the parameter updates are then tied to long-term maximisation of the cumulative reward. The latter two approaches are tested in simulation here; a minimal sketch of the threshold-triggered variance inflation follows the reference list below.

References

[1] J. Dumont, M. Giergiczny, and S. Hess, "Individual level models vs. sample level models: contrasts and mutual benefits," Transportmetrica A: Transport Science, vol. 11, pp. 465-483, 2015.
[2] B. H. Chaptini, "Use of discrete choice models with recommender systems," Massachusetts Institute of Technology, 2005.
[3] K. Train, "Mixed Logit Estimation by Maximum Simulated Likelihood," Chapter 11, 2006. Available: http://elsa.berkeley.edu/Software/abstracts/train1006mxlmsl.html
[4] K. Train, "Mixed Logit Estimation by Maximum Simulated Likelihood," Chapter 12, 2006. Available: http://elsa.berkeley.edu/Software/abstracts/train1006mxlmsl.html
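As flagged above, here is a minimal sketch (an assumed implementation, not the paper's exact procedure) of the threshold-and-variance-inflation mechanism: choices falling outside the current recommendation set are counted, and once a run of them exceeds a threshold, the posterior draws are resampled with added noise, artificially widening the prior variance so that subsequent updates track the drifted preferences faster. The threshold, noise scale, and helper names are illustrative.

```python
# Assumed sketch of drift handling via an out-of-set threshold plus prior
# variance inflation, using the same weighted-draws representation as above.
import numpy as np

OUT_OF_SET_THRESHOLD = 3   # illustrative: run length of out-of-set choices
INFLATION_SCALE = 0.5      # illustrative: std. dev. of the added noise

def mnl_probs(beta, X):
    """MNL choice probabilities for all alternatives; X is (n_alt, n_attr)."""
    v = X @ beta
    v -= v.max()
    e = np.exp(v)
    return e / e.sum()

def reweight(betas, weights, X, chosen):
    """Bayesian reweighting of the draws by the new choice's likelihood."""
    like = np.array([mnl_probs(b, X)[chosen] for b in betas])
    w = weights * like
    return w / w.sum()

def recommended_set(betas, weights, X, k=1):
    """Top-k alternatives under the current posterior-mean tastes."""
    beta_hat = weights @ betas
    return set(np.argsort(X @ beta_hat)[::-1][:k])

def inflate(betas, weights, rng, scale=INFLATION_SCALE):
    """Resample draws by weight, then add noise: an artificial widening of
    the posterior that mimics raising the prior variance before updating."""
    idx = rng.choice(len(betas), size=len(betas), p=weights)
    noisy = betas[idx] + rng.normal(0.0, scale, betas.shape)
    return noisy, np.full(len(betas), 1.0 / len(betas))

def step(betas, weights, X, chosen, out_count, rng):
    """Process one observation: count out-of-set choices, trigger inflation
    after a run of them, then apply the usual Bayesian reweighting."""
    if chosen in recommended_set(betas, weights, X):
        out_count = 0                 # isolated outliers are absorbed
    else:
        out_count += 1
    if out_count >= OUT_OF_SET_THRESHOLD:
        betas, weights = inflate(betas, weights, rng)
        out_count = 0
    weights = reweight(betas, weights, X, chosen)
    return betas, weights, out_count
```

Resampling-with-noise is used here as a stand-in for raising the prior variance directly; resetting the counter on any in-set choice is what lets the sketch absorb isolated outliers without user intervention, while a clustered run of out-of-set choices is read as genuine drift.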