Data-driven Rollout for Deterministic Optimal Control.

2021 
We consider deterministic infinite horizon optimal control problems with nonnegative stage costs. Drawing inspiration from the learning model predictive control scheme designed for continuous dynamics and iterative tasks, we propose a rollout algorithm that relies on sampled data generated by some base policy. Building on value and policy iteration ideas, we show that the proposed algorithm applies to deterministic problems with arbitrary state and control spaces and arbitrary dynamics. Owing to this flexibility, it admits extensions to problems with trajectory constraints and to problems with multiagent structure, as well as applications to the exploration of the state space. The validity of our assertions is demonstrated through various examples.
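To make the idea of rollout from sampled base-policy data concrete, the following is a minimal sketch on a toy problem. It is not the paper's algorithm or notation. The integer-line dynamics `f`, the unit stage cost `g`, the control set `CONTROLS`, the cost-free target state 0, and all function names are assumptions chosen for illustration. The sketch records the cost-to-go of a base policy along one sampled trajectory, then performs a one-step lookahead restricted to next states that appear in the sampled data.

```python
import math

# Hypothetical toy problem (an assumption, not taken from the paper):
# states are nonnegative integers, state 0 is a cost-free absorbing target.
CONTROLS = (-2, -1, 1)

def f(x, u):
    """Deterministic dynamics: move by u, clipped at the target 0."""
    return max(x + u, 0)

def g(x, u):
    """Nonnegative stage cost: one unit per step until the target is reached."""
    return 0.0 if x == 0 else 1.0

def base_policy(x):
    """A simple, suboptimal base policy: always step one unit toward 0."""
    return -1

def sample_costs(x0, policy, horizon=100):
    """Run the base policy from x0 and return {state: cost-to-go} on the trajectory."""
    states, costs = [x0], []
    x = x0
    for _ in range(horizon):
        if x == 0:
            break
        u = policy(x)
        costs.append(g(x, u))
        x = f(x, u)
        states.append(x)
    if x != 0:
        raise ValueError("base policy did not reach the target within the horizon")
    J = {0: 0.0}
    tail = 0.0
    # Accumulate stage costs backward to get the base policy's cost from each state.
    for s, c in zip(reversed(states[:-1]), reversed(costs)):
        tail += c
        J[s] = min(J.get(s, math.inf), tail)
    return J

def rollout_control(x, J):
    """One-step lookahead, restricted to next states present in the sampled data."""
    candidates = [(g(x, u) + J[f(x, u)], u) for u in CONTROLS if f(x, u) in J]
    if not candidates:
        raise ValueError(f"no control leads from {x} into the sampled set")
    return min(candidates)[1]

def total_cost(x0, control_fn, horizon=100):
    """Simulate a policy from x0 and sum its stage costs until the target."""
    x, total = x0, 0.0
    for _ in range(horizon):
        if x == 0:
            break
        u = control_fn(x)
        total += g(x, u)
        x = f(x, u)
    return total

J_data = sample_costs(10, base_policy)
base_cost = total_cost(10, base_policy)
rollout_cost = total_cost(10, lambda x: rollout_control(x, J_data))
print(base_cost, rollout_cost)
```

On this toy instance the rollout policy reaches the target at a cost no larger than that of the base policy (5 versus 10 stage-cost units), which is the kind of cost improvement that rollout schemes aim for. Restricting the lookahead to sampled states is a simplification made here so that the stored costs can be looked up directly.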