Accelerate deep Q-network learning by n-step backup

2018 
Reinforcement learning is notorious for being difficult to train and for its prolonged training time. In this paper, we propose a new n-step backup method to accelerate deep Q-network learning, resulting in faster and better convergence. Although n-step backup has been shown not to perform well with off-policy experience replay, we obtain improved performance by identifying and solving several problems. First, we reduce bias through the n-step backup while avoiding the accompanying increase in variance and computational overhead. Second, we address the overestimation problem, which n-step backup otherwise exacerbates, while retaining its faster reward propagation. We evaluate the performance of our approach in the Arcade Learning Environment.
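For reference, the standard n-step Q-learning target that such a backup builds on can be sketched as follows. This is a minimal illustration under common assumptions (a target network supplying the bootstrap value, truncation at episode termination); the function name and interface are hypothetical and are not taken from the paper.

```python
def n_step_target(rewards, dones, q_next, gamma=0.99):
    """Compute the n-step backup target
        G = r_t + gamma * r_{t+1} + ... + gamma^{n-1} * r_{t+n-1}
            + gamma^n * max_a Q(s_{t+n}, a),
    truncating the bootstrap if a terminal state occurs within the n steps.

    rewards: the n rewards observed after time t
    dones:   done flags aligned with `rewards`
    q_next:  max_a Q(s_{t+n}, a), e.g. from a target network (assumption)
    """
    target, discount = 0.0, 1.0
    for r, d in zip(rewards, dones):
        target += discount * r
        if d:                      # episode ended before n steps: no bootstrap
            return target
        discount *= gamma
    return target + discount * q_next


# Example: 3-step target with gamma = 0.99
print(n_step_target([1.0, 0.0, 1.0], [False, False, False], q_next=2.5))
```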