Learning From Explanations Using Sentiment and Advice in RL

2017 
In order for robots to learn from people with no machine learning expertise, robots should learn from natural human instruction. Most machine learning techniques that incorporate explanations require people to use a limited vocabulary and provide state information, even if it is not intuitive. This paper discusses a software agent that learned to play the Mario Bros. game using explanations. Our goals to improve learning from explanations were twofold: 1) to filter explanations into advice and warnings and 2) to learn policies from sentences without state information. We used sentiment analysis to filter explanations into advice of what to do and warnings of what to avoid. We developed object-focused advice to represent what actions the agent should take when dealing with objects. A reinforcement learning agent used object-focused advice to learn policies that maximized its reward. After mitigating false negatives, using sentiment as a filter was approximately 85% accurate. object-focused advice performed better than when no advice was given, the agent learned where to apply the advice, and the agent could recover from adversarial advice. We also found the method of interaction should be designed to ease the cognitive load of the human teacher or the advice may be of poor quality.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    22
    References
    65
    Citations
    NaN
    KQI
    []