Contextual Policy Transfer in Reinforcement Learning Domains via Deep Mixtures-of-Experts

2021 
In reinforcement learning, agents that consider the context, or current state, when selecting source policies for transfer have been shown to outperform context-free approaches. However, existing approaches suffer from limitations, including sensitivity to sparse or delayed rewards and estimation errors in value functions. One important insight is that explicitly learned models of the source dynamics, when available, can benefit contextual transfer in such settings. In this paper, we assume a family of tasks with shared sub-goals but different dynamics, and the availability of estimated dynamics models and policies for the source tasks. To deal with possible estimation errors in these dynamics models, we introduce a novel Bayesian mixture-of-experts that learns state-dependent beliefs over which source task dynamics best match the target dynamics, using state transitions collected from the target task. The mixture is easy to interpret, is robust to estimation errors in dynamics, and is compatible with most learning algorithms. We incorporate it into standard policy reuse frameworks and demonstrate its effectiveness on benchmarks from OpenAI Gym.
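To make the idea concrete, the sketch below illustrates one plausible form of the state-dependent belief update the abstract describes: a Bayesian mixture over K source dynamics models, reweighted by the likelihood each model assigns to observed target transitions. This is a minimal illustration under assumed interfaces, not the paper's deep mixture-of-experts architecture; the `source_models` objects with a `predict(s, a) -> (mean, std)` method and the callable `source_policies` are hypothetical placeholders.

```python
import numpy as np

def gaussian_loglik(next_state, mean, std):
    """Log-likelihood of an observed next state under a diagonal Gaussian prediction."""
    var = std ** 2
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (next_state - mean) ** 2 / var)

def update_beliefs(log_beliefs, source_models, s, a, s_next, temperature=1.0):
    """One Bayesian mixture update from a single target transition (s, a, s_next).

    log_beliefs: array of shape (K,), current log mixture weights.
    source_models: hypothetical learned source dynamics models, each exposing
        predict(s, a) -> (mean, std) over the next state.
    Returns normalized log weights (posterior ~ prior x likelihood).
    """
    log_lik = np.array([
        gaussian_loglik(s_next, *m.predict(s, a)) for m in source_models
    ])
    new_log = log_beliefs + log_lik / temperature        # Bayesian reweighting
    new_log -= np.max(new_log)                           # numerical stability
    return new_log - np.log(np.sum(np.exp(new_log)))     # renormalize in log space

def select_source_policy(log_beliefs, source_policies, s):
    """Contextual reuse: act with the source policy whose dynamics best match the target here."""
    return source_policies[int(np.argmax(log_beliefs))](s)
```

In this sketch the beliefs are global rather than state-dependent; the paper's mixture-of-experts would instead condition the gating weights on the state, so that different source tasks can dominate in different regions of the state space.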