Self-Learning Control Using Dual Heuristic Programming with Global Laplacian Eigenmaps

2017 
This paper presents an online self-learning control algorithm, dual heuristic programming with global Laplacian Eigenmaps (GLEM-DHP), for nonlinear optimal control problems that can be modeled as Markov decision processes (MDPs). GLEM-DHP uses GLEM, an improved manifold learning approach that incorporates global information, to learn the features for value function approximation of the MDP. Unlike traditional feature representation methods based on neural networks, the manifold-based features can be learned before the online learning process from samples collected from the MDP. More importantly, in addition to local features, global information is exploited through the geodesic minimum spanning tree (GMST) approach. Based on the theoretical properties of GMST, it is shown that the GLEM-based features can represent the intrinsic geometric properties of the MDP states, which improves value function approximation and, hence, learning control performance. GLEM-DHP is compared with previous learning control algorithms on two nonlinear control problems: the cart-pole problem and the ball-plate control problem. Simulation and experimental results show that GLEM-DHP achieves better learning control performance than previous learning control algorithms that use manually designed features or manifold features with only local information.
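As a rough illustration of the feature-learning step, the sketch below computes standard Laplacian eigenmap features from a batch of sampled states. It covers only the local, neighbourhood-graph part. The GMST-based global information that distinguishes GLEM is not reproduced, since the abstract does not specify it. The function name `laplacian_eigenmap_features`, the heat-kernel weights, and the parameters `n_neighbors`, `n_features`, and `sigma` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def laplacian_eigenmap_features(states, n_neighbors=5, n_features=2, sigma=1.0):
    """Standard (local-only) Laplacian eigenmaps on sampled states.

    Assumption: `states` is an (n_samples, state_dim) array of distinct
    states collected from the MDP before online learning starts.
    """
    n = states.shape[0]

    # Pairwise squared Euclidean distances between sampled states.
    diff = states[:, None, :] - states[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)

    # k-nearest-neighbour graph with heat-kernel edge weights.
    W = np.zeros((n, n))
    for i in range(n):
        neighbours = np.argsort(d2[i])[1:n_neighbors + 1]  # position 0 is the point itself
        W[i, neighbours] = np.exp(-d2[i, neighbours] / (2.0 * sigma ** 2))
    W = np.maximum(W, W.T)  # make the graph undirected

    # Symmetric normalised Laplacian: L_sym = I - D^{-1/2} W D^{-1/2}.
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(L_sym)

    # Rescale to solutions of L y = lambda D y and skip the first
    # (trivial, constant) eigenvector.
    features = d_inv_sqrt[:, None] * eigvecs[:, 1:n_features + 1]
    return eigvals[1:n_features + 1], features
```

Each row of the returned `features` array is a low-dimensional embedding of one sampled state. Such embeddings could serve as basis functions for a linear value function approximator, although how GLEM-DHP extends them to unseen states is not described in the abstract.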