Generative Adversarial Imitation Learning

Jonathan Ho, Stefano Ermon

2016

Consider learning a policy from example expert behavior, without interaction with the expert or access to a reinforcement signal. One approach is to recover the expert's cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a policy from data as if it were obtained by reinforcement learning following inverse reinforcement learning. We show that a certain instantiation of our framework draws an analogy between imitation learning and generative adversarial networks, from which we derive a model-free imitation learning algorithm that obtains significant performance gains over existing model-free methods in imitating complex behaviors in large, high-dimensional environments.
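The abstract does not spell out the objective, so the following is only a minimal sketch of the generative-adversarial analogy it mentions, under stated assumptions: a discriminator scores state-action pairs with a logit, D = sigmoid(logit) is read as the probability that a pair came from the learner's policy rather than the expert, and the learner is then trained to lower log D on its own pairs. The function names (`log_sigmoid`, `discriminator_loss`, `surrogate_cost`) are hypothetical, and the policy-optimization step and any entropy regularization are omitted.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def discriminator_loss(policy_logits, expert_logits):
    """Binary cross-entropy for the discriminator (assumed label convention).

    Learner pairs are treated as label 1 and expert pairs as label 0, so the
    loss is -(mean log D(learner) + mean log(1 - D(expert))).
    """
    policy_term = sum(log_sigmoid(z) for z in policy_logits) / len(policy_logits)
    # log(1 - sigmoid(z)) equals log(sigmoid(-z)).
    expert_term = sum(log_sigmoid(-z) for z in expert_logits) / len(expert_logits)
    return -(policy_term + expert_term)


def surrogate_cost(logit: float) -> float:
    """Cost handed to the learner for one state-action pair: log D(s, a).

    Pairs the discriminator confidently attributes to the learner get a cost
    near 0; pairs that look expert-like get a large negative cost.
    """
    return log_sigmoid(logit)


if __name__ == "__main__":
    # Hypothetical logits: the discriminator separates the two sets fairly well.
    learner = [2.0, 1.5, 0.5]
    expert = [-1.0, -2.5, -0.5]
    print("discriminator loss:", discriminator_loss(learner, expert))
    print("cost of a learner-like pair:", surrogate_cost(2.0))
    print("cost of an expert-like pair:", surrogate_cost(-2.0))
```

In this sketch the two players pull in opposite directions: the discriminator lowers its loss by separating learner pairs from expert pairs, while the learner lowers its cost by producing pairs the discriminator cannot tell apart from the expert's.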

Keywords:

Semi-supervised learning
Artificial intelligence
Computer science
Active learning (machine learning)
Machine learning
Robot learning
Reinforcement learning
Learning classifier system
Temporal difference learning
Unsupervised learning
Cognitive imitation
Q-learning
