Self-guided deep deterministic policy gradient with multi-actor
2021
Reinforcement learning algorithms have made substantial progress in recent years by leveraging deep neural networks. Despite this success, the performance of deep reinforcement learning algorithms depends heavily on how they explore. Some engage in exploratory behavior by injecting external noise into the action space or by adopting a Gaussian policy. This paper presents a deep reinforcement learning algorithm that needs no external noise, called self-guided deep deterministic policy gradient with multi-actor (SDDPGM), which combines deep deterministic policy gradient (DDPG) with generative adversarial networks (GANs). It uses the GAN generator, trained on excellent experiences, to guide the agent's learning, and uses the discriminator to form a subjective reward. Moreover, to make learning more stable, SDDPGM applies a multi-actor mechanism in which distinct actors act serially, each assigned to a temporal phase of an episode. Finally, experiments show that SDDPGM is a promising deep reinforcement learning method.
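The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how an episode is divided into phases or how the discriminator output enters the reward, so the equal-length phases, the additive reward, and the names `actor_index`, `combined_reward`, and `weight` are all assumptions made here for illustration.

```python
def actor_index(step, episode_length, num_actors):
    """Pick which actor acts at a given time step.

    Assumption: the episode is split into num_actors equal-length
    temporal phases, and actor k handles phase k.
    """
    if not 0 <= step < episode_length:
        raise ValueError("step must lie in [0, episode_length)")
    return step * num_actors // episode_length


def combined_reward(env_reward, disc_score, weight=0.1):
    """Add a discriminator-based 'subjective' term to the environment reward.

    Assumption: the discriminator score is simply scaled by a
    hypothetical weight and added; the paper may combine them differently.
    """
    return env_reward + weight * disc_score
```

With four actors and a 100-step episode, steps 0 to 24 would go to actor 0, steps 25 to 49 to actor 1, and so on, so each actor only ever has to learn the behavior for its own phase.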