Massively Parallel Methods for Deep Reinforcement Learning

2015 
We present the first massively distributed architecture for deep reinforcement learning. This architecture consists of four main components: parallel actors that generate new behaviour; parallel learners that are trained from stored experience; a distributed neural network to represent the value function or behaviour policy; and a distributed store of experience. We used our architecture to implement the Deep Q-Network algorithm (DQN). Our distributed algorithm was applied to 49 Atari 2600 games from the Arcade Learning Environment, using identical hyperparameters across all games. Its performance surpassed non-distributed DQN in 41 of the 49 games, and it reduced the wall-clock time required to achieve these results by an order of magnitude on most games.
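To make the data flow between the four components concrete, below is a minimal single-process sketch of the design, with a toy linear Q-function standing in for the deep network. All names here (ParameterServer, ReplayStore, actor_step, learner_step) are illustrative assumptions, not APIs from the paper, and details such as the target network that stabilises DQN are omitted; in the actual architecture the actors and learners run as many parallel processes.

```python
# Illustrative single-process sketch of the four components described in the
# abstract: actors generating behaviour, a shared experience store, learners
# computing gradients from stored experience, and a central parameter store.
# Names and toy dynamics are assumptions for illustration, not from the paper.
import random
import numpy as np

STATE_DIM, N_ACTIONS, GAMMA = 4, 2, 0.99

class ParameterServer:
    """Holds the shared Q-function parameters (distributed in the paper)."""
    def __init__(self):
        self.theta = np.zeros((N_ACTIONS, STATE_DIM))
    def apply_gradient(self, grad, lr=0.01):
        self.theta -= lr * grad
    def snapshot(self):
        return self.theta.copy()

class ReplayStore:
    """Shared store of experience tuples (s, a, r, s')."""
    def __init__(self, capacity=10_000):
        self.data, self.capacity = [], capacity
    def add(self, transition):
        self.data.append(transition)
        if len(self.data) > self.capacity:
            self.data.pop(0)
    def sample(self, batch_size):
        return random.sample(self.data, min(batch_size, len(self.data)))

def q_values(theta, s):
    # Toy linear Q-function in place of the deep network.
    return theta @ s

def actor_step(ps, store, eps=0.1):
    """Actor: act epsilon-greedily in a toy environment, store the result."""
    s = np.random.randn(STATE_DIM)
    theta = ps.snapshot()  # sync latest parameters from the parameter server
    a = random.randrange(N_ACTIONS) if random.random() < eps \
        else int(np.argmax(q_values(theta, s)))
    r, s_next = float(s[a]), np.random.randn(STATE_DIM)  # toy dynamics
    store.add((s, a, r, s_next))

def learner_step(ps, store, batch_size=32):
    """Learner: compute a Q-learning gradient on a sampled batch, ship it."""
    batch = store.sample(batch_size)
    if not batch:
        return
    theta = ps.snapshot()
    grad = np.zeros_like(theta)
    for s, a, r, s_next in batch:
        target = r + GAMMA * np.max(q_values(theta, s_next))
        td_error = q_values(theta, s)[a] - target
        grad[a] += td_error * s / len(batch)
    ps.apply_gradient(grad)  # gradients applied centrally, as in the paper

ps, store = ParameterServer(), ReplayStore()
for step in range(1000):  # actors and learners run concurrently in practice
    actor_step(ps, store)
    learner_step(ps, store)
print("parameter norm after training:", np.linalg.norm(ps.theta))
```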