Norms, Rewards, and the Intentional Stance: Comparing Machine Learning Approaches to Ethical Training

2018 
The challenge of training AI systems to behave responsibly and beneficially has inspired different approaches to teaching a system what people want and what acceptable ways of attaining it are. In this paper we compare work in reinforcement learning, in particular inverse reinforcement learning, with our norm inference approach. We test both approaches and present comparative results. Drawing on the idea of the "intentional stance", we explain how norm inference can work even when another agent is acting strictly according to reward functions. In this way, norm inference presents itself as a promising and more explicitly accountable approach for designing AI systems from the start.
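To make the contrast concrete, the following minimal Python sketch (not the paper's implementation; the toy gridworld, demonstrations, and function names are all assumptions for illustration) shows how the same observed behavior can be summarized either as a per-state scalar reward estimate, in the spirit of inverse reinforcement learning, or as an explicit, inspectable rule, in the spirit of norm inference.

```python
# Illustrative sketch only: contrasts two ways an observer might summarize
# the same demonstrations. All names and the toy environment are assumptions.

from collections import Counter

# Demonstrations: paths from (0, 0) to (2, 2) in a 3x3 grid. The demonstrator
# (hypothetically) detours around (1, 1), which it treats as forbidden.
demos = [
    [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)],
    [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)],
]
all_states = [(x, y) for x in range(3) for y in range(3)]


def irl_style_reward(trajectories, states):
    """Crude reward estimate: normalized state-visitation frequency.

    A stand-in for a real IRL algorithm; the point is that the output is a
    per-state scalar rather than an explicit rule.
    """
    visits = Counter(s for traj in trajectories for s in traj)
    total = sum(visits.values())
    return {s: visits[s] / total for s in states}


def inferred_prohibition_norms(trajectories, states):
    """Crude norm inference: states never visited in any demonstration are
    treated as candidate "forbidden" states, yielding an explicit,
    human-readable prohibition that can be inspected and contested.
    """
    visited = {s for traj in trajectories for s in traj}
    return sorted(set(states) - visited)


if __name__ == "__main__":
    reward = irl_style_reward(demos, all_states)
    norms = inferred_prohibition_norms(demos, all_states)

    print("IRL-style reward estimate (per state):")
    for s in all_states:
        print(f"  {s}: {reward[s]:.2f}")

    print("Candidate prohibition norms (never-visited states):", norms)
```

Running this prints a visitation-based reward over all nine states and reports (1, 1) as the single candidate prohibition, illustrating the abstract's point that a norm-based summary remains available and explicitly accountable even when the demonstrator's own behavior could equally be explained by a reward function.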