Multi-Agent Safe Policy Learning for Power Management of Networked Microgrids.

2019 
This paper presents a multi-agent safe policy learning (PL) method for optimal power management of networked microgrids (MGs) in distribution systems. While conventional reinforcement learning (RL) algorithms are black-box decision models that can fail to satisfy grid operational constraints, our proposed method is constrained by the AC power flow equations and other operational limits. Accordingly, the training process employs the gradient information of the operational constraints to ensure that the optimal control policy functions generate safe and feasible decisions. Furthermore, we propose a multi-agent primal-dual consensus-based training approach for the PL solver to preserve the privacy of the MGs' control policies in a distributed manner. After training, the learned optimal policy functions can be used safely by the MGs to dispatch their local resources, without the need to solve a complex optimization problem from scratch. Numerical experiments verify the performance of the proposed method.
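The sketch below is a minimal illustration, not the authors' implementation, of how a primal-dual consensus training loop of this kind can be organized: each agent descends a local Lagrangian built from a hypothetical quadratic generation-cost surrogate and a simple flow-limit constraint (stand-ins for the AC power flow model), while the dual multipliers are mixed with neighbours' values before the ascent step. The agent count, policy parameterization, step sizes, communication graph, and constraint form are all illustrative assumptions.

    # Minimal sketch of a multi-agent primal-dual policy update with
    # consensus on the dual variables. All cost/constraint functions are
    # hypothetical surrogates, not the paper's AC power-flow model.
    import numpy as np

    rng = np.random.default_rng(0)

    N_AGENTS = 3        # number of networked microgrids (illustrative)
    STATE_DIM = 4       # local measurements (e.g., load, renewable output)
    ACTION_DIM = 2      # local dispatch set-points
    LIMIT = 1.0         # illustrative operational limit (e.g., flow bound)

    # Doubly stochastic consensus weights over a ring communication graph.
    W = np.zeros((N_AGENTS, N_AGENTS))
    for i in range(N_AGENTS):
        W[i, i] = 0.5
        W[i, (i + 1) % N_AGENTS] = 0.25
        W[i, (i - 1) % N_AGENTS] = 0.25

    def local_cost(state, action):
        """Hypothetical quadratic generation-cost surrogate."""
        return float(action @ action + state[:ACTION_DIM] @ action)

    def local_constraint(state, action):
        """Hypothetical operational constraint g(s, a) <= 0."""
        return float(np.abs(action).sum() - LIMIT)

    # Linear deterministic policy per agent: a = K @ s.
    K = [0.01 * rng.standard_normal((ACTION_DIM, STATE_DIM)) for _ in range(N_AGENTS)]
    lam = np.zeros(N_AGENTS)          # one dual multiplier per agent
    ALPHA, BETA = 1e-2, 5e-2          # primal and dual step sizes

    for step in range(200):
        states = rng.standard_normal((N_AGENTS, STATE_DIM))
        grads, violations = [], np.zeros(N_AGENTS)
        for i in range(N_AGENTS):
            s = states[i]
            a = K[i] @ s
            # Finite-difference gradient of the local Lagrangian
            # L_i = cost_i + lam_i * g_i with respect to the policy parameters.
            g = np.zeros_like(K[i])
            eps = 1e-4
            base = local_cost(s, a) + lam[i] * local_constraint(s, a)
            for r in range(ACTION_DIM):
                for c in range(STATE_DIM):
                    Kp = K[i].copy()
                    Kp[r, c] += eps
                    ap = Kp @ s
                    pert = local_cost(s, ap) + lam[i] * local_constraint(s, ap)
                    g[r, c] = (pert - base) / eps
            grads.append(g)
            violations[i] = local_constraint(s, a)

        # Primal step: descend each agent's local Lagrangian.
        for i in range(N_AGENTS):
            K[i] -= ALPHA * grads[i]
        # Dual step with consensus: mix neighbours' multipliers, then ascend
        # along the constraint violation, projected onto the nonnegative orthant.
        lam = np.maximum(W @ lam + BETA * violations, 0.0)

    print("final multipliers:", np.round(lam, 3))

In this reading, only the scalar multipliers are exchanged between neighbouring agents, while each MG's policy parameters remain local; this mirrors, in a simplified way, the privacy goal stated in the abstract.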