A User Comfort Model and Index Policy for Personalizing Discrete Controller Decisions

Marcel Menner, Melanie Nicole Zeilinger

2018

User feedback allows for tailoring system operation to ensure individual user satisfaction. A major challenge in personalized decision-making is the systematic construction of a user model during operation while maintaining control performance. This paper presents both an index-based control policy that collects and processes user feedback and a user comfort model in the form of a Markov decision process with a priori unknown user-specific state transition probabilities. The control policy uses explicit user feedback to optimize a reward measure reflecting user comfort and addresses the exploration-exploitation trade-off in a multi-armed bandit framework. The proposed approach combines restless bandits and upper confidence bound algorithms: it introduces an exploration term into the restless bandit formulation, uses user feedback to identify the user model, and is shown to be indexable. We demonstrate its capabilities in a simulation of learning a user's trade-off between comfort and energy usage.
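
To make the exploration-exploitation idea concrete, the following is a minimal sketch of a classic UCB1 index policy over a few discrete controller settings. It is not the paper's method: the paper adds an exploration term to a restless bandit formulation with a Markov user model, whereas this sketch assumes stationary, independent settings with binary comfort feedback. The names `ucb_index`, `run_index_policy`, and `comfort_prob`, and the numeric values, are hypothetical and chosen only for illustration.

```python
import math
import random


def ucb_index(mean_reward, pulls, total_pulls, c=2.0):
    """Classic UCB1 index: empirical mean plus an exploration bonus."""
    if pulls == 0:
        return math.inf  # settings never tried are selected first
    return mean_reward + math.sqrt(c * math.log(total_pulls) / pulls)


def run_index_policy(comfort_prob, steps=2000, seed=0):
    """Simulate choosing among discrete settings using binary user feedback.

    comfort_prob[a] is an assumed (hypothetical) probability that the user
    reports being comfortable when setting a is applied.
    """
    rng = random.Random(seed)
    n = len(comfort_prob)
    pulls = [0] * n    # how often each setting was applied
    means = [0.0] * n  # running mean of the feedback per setting
    for t in range(1, steps + 1):
        indices = [ucb_index(means[a], pulls[a], t) for a in range(n)]
        a = max(range(n), key=indices.__getitem__)  # apply highest-index setting
        feedback = 1.0 if rng.random() < comfort_prob[a] else 0.0
        pulls[a] += 1
        means[a] += (feedback - means[a]) / pulls[a]  # incremental mean update
    return pulls, means


pulls, means = run_index_policy([0.3, 0.8, 0.5])
```

In this simplified setting, the exploration bonus shrinks as a setting is tried more often, so the policy gradually concentrates on the setting with the highest estimated comfort while still occasionally revisiting the others.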
