Selective Transition Collection in Experience Replay

2021 
The experience replay method is widely used in off-policy reinforcement learning. As training progresses, the distribution of the collected transitions becomes increasingly concentrated, which leads to catastrophic forgetting and slow convergence. In this paper, we present the selective transition collection algorithm, a new design that addresses the concentrated distribution by selectively collecting transitions. We propose a method to estimate the similarity between transitions, together with a probability function that reduces the chance of collecting transitions that are highly similar to those already in the experience memory. We test our method on common reinforcement learning tasks, and the experimental results demonstrate that selective transition collection not only speeds up learning but also effectively prevents catastrophic forgetting.
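The abstract does not specify the similarity measure or the exact form of the collection probability, so the following is only a minimal sketch of the general idea: before a new transition enters the replay buffer, compare it against a subset of stored transitions and admit it with a probability that decreases as the similarity grows. The class name SelectiveReplayBuffer, the Euclidean-distance similarity, and the exponential acceptance function are all illustrative assumptions, not the paper's actual design.

```python
import numpy as np
from collections import deque


class SelectiveReplayBuffer:
    """Illustrative replay buffer that stochastically rejects transitions
    that are very similar to those already stored (sketch only)."""

    def __init__(self, capacity=100_000, temperature=1.0, sample_size=64, rng=None):
        self.buffer = deque(maxlen=capacity)
        self.temperature = temperature   # how strongly similarity suppresses collection (assumed)
        self.sample_size = sample_size   # number of stored transitions used for comparison (assumed)
        self.rng = rng or np.random.default_rng()

    @staticmethod
    def _features(transition):
        # Hypothetical featurisation: concatenate flattened state and action vectors.
        state, action, reward, next_state, done = transition
        return np.concatenate([np.ravel(state), np.ravel(action)])

    def _similarity(self, transition):
        # Assumed similarity: mean Euclidean distance to a random subset of
        # stored transitions, squashed into (0, 1] so that 1 means "identical".
        if not self.buffer:
            return 0.0
        idx = self.rng.choice(len(self.buffer),
                              size=min(self.sample_size, len(self.buffer)),
                              replace=False)
        f = self._features(transition)
        dists = [np.linalg.norm(f - self._features(self.buffer[i])) for i in idx]
        return float(np.exp(-np.mean(dists)))

    def add(self, transition):
        # Collection probability decreases as similarity to the memory increases.
        sim = self._similarity(transition)
        p_collect = np.exp(-sim / self.temperature)
        if self.rng.random() < p_collect:
            self.buffer.append(transition)
            return True
        return False

    def sample(self, batch_size):
        idx = self.rng.choice(len(self.buffer), size=batch_size, replace=False)
        return [self.buffer[i] for i in idx]
```

In use, an off-policy agent would call add() on every environment step instead of appending unconditionally, and sample() for minibatches as usual; the net effect of the rejection step is to keep the stored distribution broader than what the current policy alone would produce.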