Differentiable MPC for End-to-end Planning and Control

2018 
In this paper we present foundations for using model predictive control (MPC) as a differentiable policy class in reinforcement learning. Specifically, we differentiate through MPC using the KKT conditions of the convex approximation at a fixed point of the solver. This strategy lets us learn the cost and dynamics of a controller end-to-end within a larger system. We present empirical results in an imitation learning setting, demonstrating that we can recover the underlying dynamics and cost more efficiently and reliably than with a generic neural network policy class.
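To make the KKT-based differentiation concrete, here is a minimal sketch, not the authors' implementation, of the core idea on the simplest case: an equality-constrained quadratic program, whose KKT conditions at the optimum reduce to a single linear system. Solving that system in JAX and letting autodiff propagate through the solve is equivalent to implicit differentiation of the KKT conditions at the fixed point. The function name `solve_eq_qp`, the toy imitation loss, and all problem data below are illustrative assumptions, not from the paper.

```python
import jax
import jax.numpy as jnp

def solve_eq_qp(Q, p, A, b):
    """Solve min_z 0.5 z'Qz + p'z  s.t.  Az = b via its KKT system.

    At the optimum the KKT conditions are linear:
        [Q  A'] [z ]   [-p]
        [A  0 ] [nu] = [ b]
    Differentiating through this solve implements implicit
    differentiation of the KKT conditions at the fixed point.
    """
    n, m = Q.shape[0], A.shape[0]
    K = jnp.block([[Q, A.T], [A, jnp.zeros((m, m))]])
    rhs = jnp.concatenate([-p, b])
    sol = jnp.linalg.solve(K, rhs)
    return sol[:n]  # optimal z*; sol[n:] are the multipliers nu*

def loss(p, Q, A, b, z_expert):
    # Toy imitation-style objective: learn the cost term p so the
    # QP's solution matches an "expert" action.
    z = solve_eq_qp(Q, p, A, b)
    return jnp.sum((z - z_expert) ** 2)

Q = jnp.eye(3)                       # illustrative cost Hessian
A = jnp.ones((1, 3))                 # illustrative constraint Az = b
b = jnp.array([1.0])
z_expert = jnp.array([0.2, 0.3, 0.5])
p = jnp.zeros(3)

grad_p = jax.grad(loss)(p, Q, A, b, z_expert)  # gradient through the QP
```

The paper's setting is richer: MPC is a sequence of such convex subproblems with inequality constraints over a horizon, and the gradients flow back into learned cost and dynamics parameters, but the mechanism is the same as in this sketch.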