An integrated MPC and deep reinforcement learning approach to trams-priority active signal control

2021 
Abstract The problem of active signal priority control for trams is investigated. A solution combining model predictive control (MPC) and deep reinforcement learning (DRL) is proposed to minimize tram stops at intersections while reducing the delay of general vehicles. An efficient new DRL framework is introduced that improves proximal policy optimization with model-based acceleration (PPOMA). The DRL module is strengthened by a model predictive controller, which provides a low-precision prediction of the real-time traffic dynamics to improve learning performance. The problem is modeled as a high-dimensional Markov decision process. A dynamic phase sequence is used to make signal priority control more flexible, rather than only optimizing a signal cycle within a fixed phase sequence as other methods do. The optimal traffic signal sequence is obtained from real-time traffic information collected from vehicular networks. Experiments in the SUMO simulator show the advantage of our method over existing methods.
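The difference between a fixed and a dynamic phase sequence can be illustrated with a toy sketch. It is not the paper's implementation: the phase names, the state fields, the reward weights, and the greedy rule standing in for the learned policy are all assumptions made for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical phase labels; the paper's actual phase set is not given in the abstract.
PHASES: List[str] = ["NS_through", "EW_through", "NS_left", "EW_left"]


@dataclass
class IntersectionState:
    """Assumed state features for one decision step of the MDP."""
    queues: Dict[str, int]        # vehicles waiting for each phase
    tram_phase: Optional[str]     # phase an approaching tram needs, or None
    current_phase: str


def fixed_sequence_action(state: IntersectionState) -> str:
    """Fixed phase sequence: always advance to the next phase in the cycle."""
    i = PHASES.index(state.current_phase)
    return PHASES[(i + 1) % len(PHASES)]


def dynamic_sequence_action(state: IntersectionState) -> str:
    """Dynamic phase sequence: any phase may follow any other.

    A greedy rule stands in for the learned DRL policy here (assumption):
    serve the tram if one is approaching, otherwise the longest queue.
    """
    if state.tram_phase is not None:
        return state.tram_phase
    return max(PHASES, key=lambda p: state.queues[p])


def reward(state: IntersectionState, action: str, tram_weight: float = 10.0) -> float:
    """Assumed reward: penalize a stopped tram and vehicles left waiting."""
    tram_stopped = state.tram_phase is not None and action != state.tram_phase
    waiting = sum(q for p, q in state.queues.items() if p != action)
    return -(tram_weight * float(tram_stopped) + waiting)


if __name__ == "__main__":
    s = IntersectionState(
        queues={"NS_through": 3, "EW_through": 5, "NS_left": 1, "EW_left": 0},
        tram_phase="NS_left",
        current_phase="NS_through",
    )
    for policy in (fixed_sequence_action, dynamic_sequence_action):
        a = policy(s)
        print(policy.__name__, a, reward(s, a))
```

In this toy state the fixed sequence moves on to the next phase in the cycle and stops the tram, while the dynamic choice can jump directly to the phase the tram needs. In the paper this choice is made by the trained policy rather than a hand-written rule.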