Yesterday's Reward is Today's Punishment: Contrast Effects in Human Feedback to Reinforcement Learning Agents

2020 
Autonomous agents promise users a personalized future, allowing them to direct their attention to the tasks most meaningful to them. However, the demands of personalization are unfulfilled by current agent training paradigms such as machine learning, which require large amounts of data to train agents on a single task. In sequential decision-making domains, Reinforcement Learning (RL) addresses this need when a priori training of desired behaviors is intractable. Prior work has leveraged user input to train agents by mapping it to numerical reward signals. However, recent approaches have identified inconsistent human feedback as a bottleneck to achieving best-case performance. In this work, we present empirical evidence that human perception, affected by contrast effects, distorts the feedback given to Reinforcement Learning agents. Through a set of studies involving 900 participants from Amazon Mechanical Turk who were asked to give feedback to RL agents, we show that participants significantly underrate an agent's actions after being exposed to an agent of higher competence on the same task. To understand the significance of this effect on agent performance during training, we then simulate trainers that underrate an agent's actions based on past performance, creating a systematically skewed feedback signal, and integrate this signal into an actor-critic framework. Our results show that agent performance in Atari environments is reduced by up to 98% in the presence of systematic skews in human feedback. Our work provides a conceptual understanding of a source of inconsistency in human feedback, thus informing the design of human-agent interactions.
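
As a rough illustration of the simulation described in the abstract, the sketch below models a trainer whose rating of an action is anchored on a previously observed, more competent agent, so the same action is underrated in proportion to the competence gap. This is a minimal sketch under stated assumptions, not the paper's implementation: the function name contrast_feedback, the linear penalty model, and the skew_strength parameter are all illustrative.

    def contrast_feedback(true_quality, reference_quality, skew_strength=0.5):
        """Simulated trainer feedback subject to a contrast effect.

        The rating is anchored on a previously observed agent: the action
        is underrated in proportion to how far the reference (higher
        competence) agent outperformed it. With no competence gap, the
        feedback equals the action's true quality.
        """
        contrast_penalty = skew_strength * max(0.0, reference_quality - true_quality)
        return true_quality - contrast_penalty

    # Example: rating an action of true quality 0.6 before and after
    # watching a reference agent whose actions averaged 0.9 in quality.
    unbiased = contrast_feedback(0.6, reference_quality=0.6)  # -> 0.6
    skewed = contrast_feedback(0.6, reference_quality=0.9)    # -> 0.45

In a training setup along the lines the abstract describes, such a systematically skewed signal would stand in for the human-provided reward consumed by the actor-critic update, which is how a perceptual bias in the trainer propagates into degraded agent performance.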