Empirical prior based probabilistic inference neural network for policy learning

2022 
Reinforcement learning is widely used for autonomous control of systems with unknown dynamics. In physical systems, however, low data efficiency remains a practical concern. The underlying reason is that current state-of-the-art probabilistic models must evaluate multiple forecasts before estimating uncertainty, which makes uncertainty inference computationally intensive, if not intractable. To bridge this gap, a probabilistic dynamics model is introduced in the current study. A self-consistent condition based on an empirical prior is formulated to measure uncertainty by inferring a parametric distribution, and the resulting neural network is termed a probabilistic inference neural network (PINN). This approach is remarkably effective for both data regression and policy learning. Experiments are performed on UCI datasets, and the results demonstrate that PINN is more competitive than currently prevailing methods. Finally, a PINN-based policy learning framework is proposed and applied to benchmark control tasks and to a physical control task on an in-house, self-developed 14-DOF robot with high-dimensional state and action spaces. The results demonstrate highly competitive data efficiency and policy performance, outperforming the most prevalent existing policy learning framework.
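
The abstract does not spell out the self-consistent empirical-prior condition, so the following is only a minimal sketch of the general idea it builds on: a dynamics network that infers a parametric (here Gaussian) predictive distribution in a single forward pass, so that uncertainty is obtained without evaluating multiple forecasts. The architecture, layer sizes, clamping bounds, and negative log-likelihood loss are illustrative assumptions, not the paper's PINN.

```python
# Hypothetical sketch: a dynamics model that outputs a parametric (Gaussian)
# predictive distribution over the next state in one forward pass.
# All names and hyperparameters below are assumptions for illustration.
import torch
import torch.nn as nn

class GaussianDynamicsModel(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=200):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mean_head = nn.Linear(hidden, state_dim)    # predicted next-state mean
        self.logvar_head = nn.Linear(hidden, state_dim)  # predicted log-variance (uncertainty)

    def forward(self, state, action):
        h = self.body(torch.cat([state, action], dim=-1))
        mean = self.mean_head(h)
        logvar = self.logvar_head(h).clamp(-10.0, 5.0)   # keep variance numerically sane
        return mean, logvar

def gaussian_nll(mean, logvar, target):
    # Negative log-likelihood of the observed next state under the predicted
    # Gaussian; minimizing it fits both the mean and the uncertainty estimate.
    return 0.5 * (logvar + (target - mean) ** 2 / logvar.exp()).mean()

# Illustrative training step on a batch of (s, a, s') transitions.
model = GaussianDynamicsModel(state_dim=11, action_dim=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
s, a, s_next = torch.randn(64, 11), torch.randn(64, 3), torch.randn(64, 11)
mean, logvar = model(s, a)
loss = gaussian_nll(mean, logvar, s_next)
opt.zero_grad(); loss.backward(); opt.step()
```

In a model-based policy learning loop, such a model would supply both a next-state prediction and a per-step uncertainty estimate without repeated sampling, which is the data-efficiency argument the abstract makes; how the empirical prior enters the training objective is specific to the paper and not reproduced here.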