A New Subsampling Deep Q Network Method

2020 
In traditional deep reinforcement learning, the experience replay mechanism draws samples with equal probability, ignoring the relative importance of individual samples; as a result, training may make excessive use of samples that carry little information. To address this problem, this paper proposes a subsampling mechanism: a first batch of samples is drawn uniformly at random, a sampling probability is then computed for each sampled transition from its TD error, a second batch is drawn by stratified sampling according to these probabilities, and the deep Q network is trained on the second batch. The method is applied to the DQN algorithm, and simulation experiments in several environments on the OpenAI Gym platform verify its effectiveness. Comparative analysis of the simulation results shows that the subsampling mechanism improves the quality of the training samples, achieves an effective approximation of the value function, exhibits good learning efficiency and generalization performance, and significantly improves convergence speed and training performance.
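To make the two-stage procedure concrete, the following is a minimal Python sketch of the subsampling step as described in the abstract. All names here (subsample_batch, td_error_fn, the batch sizes, and the number of strata) are illustrative assumptions, not identifiers from the paper; the TD error for a transition (s, a, r, s') is assumed to be the standard |r + γ max_a' Q(s', a') − Q(s, a)|.

```python
import random
import numpy as np

def subsample_batch(buffer, td_error_fn, first_batch_size=256,
                    second_batch_size=32, num_strata=8, eps=1e-6):
    """Sketch of the paper's two-stage subsampling (names are assumptions).

    1. Draw a large first batch uniformly at random from the replay buffer.
    2. Turn each transition's absolute TD error into a sampling probability,
       partition the first batch into strata by that probability, and draw
       the final, smaller training batch by stratified sampling.
    """
    # Stage 1: uniform random first batch.
    first_batch = random.sample(buffer, min(first_batch_size, len(buffer)))

    # Stage 2: sampling probabilities proportional to |TD error|;
    # eps keeps zero-error transitions sampleable.
    td_errors = np.abs(np.array([td_error_fn(t) for t in first_batch])) + eps
    probs = td_errors / td_errors.sum()

    # Sort transitions by probability and split them into equal strata,
    # so both high- and low-error transitions are represented.
    order = np.argsort(probs)
    strata = np.array_split(order, num_strata)
    per_stratum = max(1, second_batch_size // num_strata)

    # Within each stratum, sample without replacement in proportion
    # to the (renormalized) probabilities.
    selected = []
    for stratum in strata:
        if len(stratum) == 0:
            continue
        stratum_probs = probs[stratum] / probs[stratum].sum()
        picks = np.random.choice(stratum,
                                 size=min(per_stratum, len(stratum)),
                                 replace=False, p=stratum_probs)
        selected.extend(first_batch[i] for i in picks)
    return selected
```

In a DQN training loop, this function would replace the usual uniform minibatch draw: the returned transitions feed the gradient step on the Q network, while the buffer itself is still filled by ordinary environment interaction.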