Differentiable MPC for End-to-end Planning and Control
2018
In this paper we present foundations for using model predictive control (MPC) as a differentiable policy class in reinforcement learning. Specifically, we differentiate through MPC by using the KKT conditions of the convex approximation at a fixed point of the solver. Using this strategy, we are able to learn the cost and dynamics of a controller via end-to-end learning in a larger system. We empirically show results in an imitation learning setting, demonstrating that we can recover the underlying dynamics and cost more efficiently and reliably than with a generic neural network policy class.
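The central mechanism, implicit differentiation through the KKT conditions of a convex problem at its solution, can be illustrated on an equality-constrained quadratic program. This is a simplified sketch of the idea, not the paper's full box-constrained iLQR solver; all names here are illustrative:

```python
import numpy as np

def solve_qp_with_grad(Q, q, A, b):
    """Solve min_z 0.5 z'Qz + q'z  s.t.  Az = b via its KKT system,
    and return dz*/dq obtained by implicitly differentiating the
    KKT conditions (rather than unrolling a solver)."""
    n, m = Q.shape[0], A.shape[0]
    # Stationarity + primal feasibility stack into one linear system:
    #   [Q  A'] [z*  ]   [-q]
    #   [A  0 ] [lam*] = [ b]
    K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([-q, b]))
    z_star = sol[:n]
    # Differentiating the KKT system w.r.t. q gives K [dz; dlam] = [-dq; 0],
    # so dz*/dq is the top-left n x n block of -K^{-1}.
    dz_dq = -np.linalg.inv(K)[:n, :n]
    return z_star, dz_dq
```

In a learning loop, `dz_dq` is the Jacobian that backpropagation would chain through to update cost parameters, which is how a differentiable controller can be trained end-to-end inside a larger network.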