Moral Gridworlds: A Theoretical Proposal for Modeling Artificial Moral Cognition

2020 
I describe a suite of reinforcement learning environments in which artificial agents learn to value and respond to moral content and contexts. I illustrate the core principles of the framework by characterizing one such environment, or “gridworld,” in which an agent learns to trade off between monetary profit and fair dealing, as operationalized in a standard behavioral-economics paradigm. I then highlight the core technical and philosophical advantages of this learning-based approach for modeling moral cognition and for addressing the so-called value alignment problem in AI.
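
To make the proposal concrete, below is a minimal, illustrative sketch (not the paper's implementation) of how such a trade-off can be posed as a reinforcement learning problem: a tabular Q-learning agent repeatedly chooses how to split a monetary endowment, and its reward combines profit with a penalty for unequal splits. The class name FairSplitEnv, the fairness_weight parameter, and the reduction of the gridworld to a single-step split choice are assumptions made purely for illustration.

import random

class FairSplitEnv:
    """Toy one-step task: the agent splits an endowment with a partner.

    Action a in {0, ..., endowment} is the amount the agent keeps.
    Reward = amount kept - fairness_weight * |kept - given|,
    i.e. monetary profit minus a penalty for unequal splits.
    """

    def __init__(self, endowment=10, fairness_weight=0.6):
        self.endowment = endowment
        self.fairness_weight = fairness_weight
        self.n_actions = endowment + 1

    def step(self, action):
        kept = action
        given = self.endowment - action
        return kept - self.fairness_weight * abs(kept - given)

def q_learning(env, episodes=5000, alpha=0.1, epsilon=0.1):
    """Tabular Q-learning over the single-state split task."""
    q = [0.0] * env.n_actions
    for _ in range(episodes):
        if random.random() < epsilon:
            a = random.randrange(env.n_actions)      # explore
        else:
            a = max(range(env.n_actions), key=lambda i: q[i])  # exploit
        r = env.step(a)
        q[a] += alpha * (r - q[a])  # one-step task: no bootstrapped next state
    return q

if __name__ == "__main__":
    env = FairSplitEnv(endowment=10, fairness_weight=0.6)
    q = q_learning(env)
    best = max(range(env.n_actions), key=lambda i: q[i])
    print(f"Learned split: keep {best}, give {env.endowment - best}")

Under these illustrative assumptions, a fairness_weight above 0.5 makes the per-unit penalty for inequality outweigh the per-unit profit, so the learned policy converges to an equal split; below 0.5 the agent keeps the full endowment, which makes the profit/fairness trade-off explicit.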