Model-based Ensemble Reinforcement Learning with Soft Proximal Policy Optimization

2021 
Model-free reinforcement learning is now widely used in games, robot control, and other fields, and has achieved strong results. However, these algorithms require a large number of samples to reach good performance, which limits their application in real-world domains. Model-based reinforcement learning needs far fewer samples, but typically requires careful tuning. In this article, we learn the environment dynamics with an ensemble of neural networks and use model predictive control as the basic control framework. The learned dynamics model is also used to pretrain a model-free policy network, combining the sample efficiency of model-based methods with the asymptotic performance of model-free ones. We evaluate our method on MuJoCo benchmark tasks. The results show that, compared with other model-free and model-based methods, our approach achieves better task performance while retaining excellent sample efficiency.
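The abstract describes the pipeline only at a high level. As an illustration of the two components it names, the following is a minimal sketch, assuming a PyTorch implementation (the abstract does not specify a framework), of an ensemble dynamics model queried by a random-shooting model predictive controller. The `reward_fn` argument, network sizes, planning horizon, and candidate count are hypothetical placeholders, not the paper's actual settings.

```python
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """One ensemble member: predicts the next state from (state, action)."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, action):
        # Predict a state delta and add it to the current state.
        return state + self.net(torch.cat([state, action], dim=-1))

class EnsembleMPC:
    """Random-shooting MPC planner over an ensemble of learned dynamics models."""
    def __init__(self, models, reward_fn, horizon=15, n_candidates=500):
        self.models = models            # list of DynamicsModel
        self.reward_fn = reward_fn      # reward_fn(states, actions) -> rewards, assumed known
        self.horizon = horizon
        self.n_candidates = n_candidates

    @torch.no_grad()
    def plan(self, state, action_dim, low=-1.0, high=1.0):
        # Sample candidate action sequences uniformly at random.
        actions = torch.empty(
            self.n_candidates, self.horizon, action_dim).uniform_(low, high)
        returns = torch.zeros(self.n_candidates)
        s = state.unsqueeze(0).repeat(self.n_candidates, 1)
        for t in range(self.horizon):
            a = actions[:, t]
            returns += self.reward_fn(s, a)
            # Average next-state predictions across the ensemble.
            s = torch.stack([m(s, a) for m in self.models]).mean(dim=0)
        # Receding horizon: execute only the first action of the best sequence.
        return actions[returns.argmax(), 0]
```

In the full method the ensemble would be fit to real transitions (e.g., by minimizing one-step prediction error), and transitions imagined under the learned models would additionally be used to pretrain the model-free policy (the soft PPO component mentioned in the title); both of those stages are omitted from this sketch.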