Provably Efficient Exploration for RL with Unsupervised Learning.

2020 
We study how to use unsupervised learning for efficient exploration in reinforcement learning with rich observations generated from a small number of latent states. We present a novel algorithmic framework built on two components: an unsupervised learning algorithm and a no-regret tabular reinforcement learning algorithm. We show that our algorithm provably finds a near-optimal policy with sample complexity polynomial in the number of latent states, which is significantly smaller than the number of possible observations. Our result provides theoretical justification for the prevailing paradigm of using unsupervised learning for efficient exploration [tang2017exploration, bellemare2016unifying].
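To make the two-component framework concrete, here is a minimal sketch on a toy block-MDP, where a handful of latent states emit noisy high-dimensional observations. It is an illustration under assumed details, not the authors' algorithm: k-means stands in for the unsupervised learning component, and plain tabular Q-learning stands in for the no-regret tabular RL component; all names and parameters (emission, decoder, Q, etc.) are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy block-MDP: a few latent states emit rich (noisy, high-dimensional) observations.
# Everything below is an illustrative assumption, not the paper's construction.
rng = np.random.default_rng(0)
N_LATENT, N_ACTIONS, OBS_DIM, HORIZON = 5, 2, 64, 10
emission = rng.normal(size=(N_LATENT, OBS_DIM))                    # latent state -> mean observation
transition = rng.dirichlet(np.ones(N_LATENT), size=(N_LATENT, N_ACTIONS))
reward = rng.uniform(size=(N_LATENT, N_ACTIONS))

def emit(s):
    """Rich observation: mean emission of latent state s plus Gaussian noise."""
    return emission[s] + 0.1 * rng.normal(size=OBS_DIM)

# Component 1: an unsupervised learner decodes observations into latent-state labels.
obs_buffer, s = [], 0
for _ in range(2000):                                              # one long exploratory random walk
    a = int(rng.integers(N_ACTIONS))
    obs_buffer.append(emit(s))
    s = rng.choice(N_LATENT, p=transition[s, a])
decoder = KMeans(n_clusters=N_LATENT, n_init=10, random_state=0).fit(np.array(obs_buffer))

# Component 2: a tabular RL algorithm (plain Q-learning here, standing in for the
# no-regret algorithm) runs on the decoded latent-state indices, not raw observations.
Q = np.zeros((N_LATENT, N_ACTIONS))
for episode in range(500):
    s = 0
    z = int(decoder.predict(emit(s).reshape(1, -1))[0])            # decoded label of the current observation
    for _ in range(HORIZON):
        a = int(Q[z].argmax()) if rng.random() > 0.2 else int(rng.integers(N_ACTIONS))
        r = reward[s, a]
        s = rng.choice(N_LATENT, p=transition[s, a])
        z_next = int(decoder.predict(emit(s).reshape(1, -1))[0])
        Q[z, a] += 0.1 * (r + 0.9 * Q[z_next].max() - Q[z, a])     # tabular Q-learning update
        z = z_next

print("Learned Q-table over decoded latent states:\n", np.round(Q, 2))
```

Note that the Q-table has one row per decoded cluster rather than per observation, which is why the sample complexity in this style of approach scales with the number of latent states instead of the (much larger) observation space; the cluster labels are only identified up to a permutation of the true latent states, which is harmless for planning over the decoded MDP.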