A New Continuous-Time Policy Iteration for Time-Varying Nonlinear Systems

2020 
A new optimization method for time-varying nonlinear systems, the continuous-time policy iteration (CTPI) method, is designed; it is a kind of adaptive critic design (ACD). An iterative control law is established in CTPI to solve the generalized Hamilton-Jacobi-Bellman (HJB) equation in each iteration. To avoid solving the partial differential equation directly, neural networks are used in the CTPI algorithm. The properties of CTPI are also analyzed: its monotonically non-increasing property and its convergence are proven, which are the main contributions. Finally, numerical results show the effectiveness of CTPI.
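As a rough illustration of the policy-iteration loop the abstract describes (evaluate the current control law through a generalized HJB equation, then improve the control law), the sketch below uses a much simpler setting than the paper's. It assumes a scalar, time-invariant, linear system dx/dt = a*x + b*u with cost ∫(q*x² + r*u²)dt, for which the generalized HJB equation reduces to a scalar algebraic equation, so no neural network is needed. All function names and parameter values are hypothetical and are not taken from the paper.

```python
import math


def evaluate_policy(k, a, b, q, r):
    """Policy evaluation for u = -k*x with value function V(x) = p*x**2.

    In this assumed scalar case the generalized HJB equation becomes
    2*p*(a - b*k) + q + r*k**2 = 0, which is solved for p.
    """
    closed_loop = a - b * k
    if closed_loop >= 0:
        raise ValueError("gain k must stabilize the system (a - b*k < 0)")
    return (q + r * k * k) / (-2.0 * closed_loop)


def improve_policy(p, b, r):
    """Policy improvement: u = -(1/2) * (b/r) * dV/dx = -(b*p/r) * x."""
    return b * p / r


def policy_iteration(k0, a=1.0, b=1.0, q=1.0, r=1.0, iterations=8):
    """Alternate evaluation and improvement; return the sequence of p values."""
    k, history = k0, []
    for _ in range(iterations):
        p = evaluate_policy(k, a, b, q, r)
        history.append(p)
        k = improve_policy(p, b, r)
    return history


def riccati_solution(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b*p)**2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    # k0 = 3.0 is an assumed initial stabilizing gain (a - b*k0 < 0).
    for i, p in enumerate(policy_iteration(k0=3.0)):
        print(i, p)
```

In this toy case the sequence of p values is non-increasing and approaches the Riccati solution, which mirrors, in a far simpler setting, the monotonicity and convergence properties the abstract claims for CTPI. The time-varying nonlinear case treated in the paper would additionally need a time-dependent value function and a function approximator in place of the closed-form evaluation step.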