A Distributional View on Multi-Objective Policy Optimization

2020 
Many real-world problems require trading off multiple competing objectives. However, these objectives are often in different units and/or scales, which can make it challenging for practitioners to express numerical preferences over objectives in their native units. In this paper we propose a novel algorithm for multi-objective reinforcement learning that enables setting desired preferences for objectives in a scale-invariant way. We propose to learn an action distribution for each objective, and we use supervised learning to fit a parametric policy to a combination of these distributions. We demonstrate the effectiveness of our approach on challenging high-dimensional real and simulated robotics tasks, and show that setting different preferences in our framework allows us to trace out the space of nondominated solutions.
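The abstract sketches the core recipe: learn one improved action distribution per objective, then fit the parametric policy to a combination of those distributions by supervised (weighted maximum-likelihood) learning. Below is a minimal, hypothetical Python/NumPy sketch of that idea on a toy discrete action space; the names (per_objective_weights, temperatures, q_values) are illustrative assumptions, not the paper's code. In the actual method the per-objective values come from learned Q-functions and each temperature is optimized from a per-objective KL-bound preference, which is what makes the preferences scale-invariant.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a discrete action space and per-objective action values for one
# state, purely illustrative (the paper addresses continuous control with
# learned per-objective critics).
num_actions = 5
num_objectives = 2
q_values = rng.normal(size=(num_objectives, num_actions))  # Q_k(s, a) for a fixed state

# One temperature per objective; here fixed by hand, whereas the paper derives
# each temperature from a preference expressed as a KL constraint.
temperatures = np.array([0.5, 2.0])

def per_objective_weights(q, temperature):
    """Improved (non-parametric) action distribution for one objective:
    a softmax over the action values at the given temperature."""
    logits = q / temperature
    logits -= logits.max()          # numerical stability
    w = np.exp(logits)
    return w / w.sum()

# Step 1: one improved action distribution per objective.
weights = np.stack([per_objective_weights(q_values[k], temperatures[k])
                    for k in range(num_objectives)])

# Step 2: fit a parametric policy to a combination of these distributions via
# weighted maximum likelihood (the "supervised learning" step). For a
# categorical policy, minimizing the summed KL from each per-objective
# distribution to the policy has a closed form: their average.
policy = weights.mean(axis=0)

print("per-objective distributions:\n", weights)
print("fitted policy:\n", policy)
```

In this toy version a lower temperature makes an objective's distribution greedier, so that objective pulls the fitted policy more strongly toward its best actions; varying the per-objective preferences in this way is what lets the framework trace out different trade-offs between objectives.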