Deep Exploration via Bootstrapped DQN

2016 
Efficient exploration in complex environments remains a major challenge for reinforcement learning. We propose bootstrapped DQN, a simple algorithm that explores in a computationally and statistically efficient manner through the use of randomized value functions. Unlike dithering strategies such as epsilon-greedy exploration, bootstrapped DQN carries out temporally-extended (or deep) exploration; this can lead to exponentially faster learning. We demonstrate these benefits in complex stochastic MDPs and in the large-scale Arcade Learning Environment. Bootstrapped DQN substantially improves learning speed and performance across most Atari games.
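
To make the mechanism concrete, below is a minimal sketch of the core idea as described in the abstract: a Q-network with a shared torso and K bootstrap heads, where one head is sampled at the start of each episode and followed greedily for the episode's duration, yielding temporally-extended exploration instead of per-step dithering. The layer sizes, the default K = 10, and the `BootstrappedQNetwork` / `act` names are illustrative assumptions for this sketch, not specifics taken from the paper.

```python
import random

import torch
import torch.nn as nn


class BootstrappedQNetwork(nn.Module):
    """Q-network with a shared torso and K bootstrap heads.

    Each head acts as an approximate posterior sample of the Q-function.
    At the start of each episode one head is chosen uniformly at random
    and followed greedily, giving deep (temporally-extended) exploration.
    """

    def __init__(self, obs_dim: int, n_actions: int, n_heads: int = 10):
        super().__init__()
        self.n_heads = n_heads
        # Shared feature torso (sizes are illustrative, not from the paper).
        self.torso = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
        )
        # One linear Q-value head per bootstrap sample.
        self.heads = nn.ModuleList(
            [nn.Linear(64, n_actions) for _ in range(n_heads)]
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        """Return Q-values of shape (batch, n_heads, n_actions)."""
        features = self.torso(obs)
        return torch.stack([head(features) for head in self.heads], dim=1)


def act(net: BootstrappedQNetwork, obs: torch.Tensor, head: int) -> int:
    """Greedy action under the head sampled for the current episode."""
    with torch.no_grad():
        q = net(obs.unsqueeze(0))[0, head]  # (n_actions,)
    return int(q.argmax().item())


if __name__ == "__main__":
    net = BootstrappedQNetwork(obs_dim=4, n_actions=2, n_heads=10)
    # Sample one head per episode and commit to it for the whole episode.
    episode_head = random.randrange(net.n_heads)
    obs = torch.randn(4)
    print("action:", act(net, obs, episode_head))
```

In the full algorithm, diversity among the heads is maintained by training each head on its own bootstrapped subsample of the replay data (e.g., via per-transition Bernoulli masks); that training loop is omitted here for brevity.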