Designing Reinforcement Learning Agents for Traffic Signal Control with the Right Goals: a Time-Loss based Approach

2021 
In recent years, several studies have proposed Reinforcement Learning (RL) methods and formulations for the Traffic Signal Control problem, reporting promising results and flexible traffic light solutions. These studies generally choose travel time as the optimization objective. However, travel time has some crucial shortcomings: it is not easily decomposable into rewards; it hinders analysis at any simulation time other than the very end; and it produces unrealistic results in deadlock and starvation situations. In this paper, we propose two objectives based on time loss, namely instant time loss per driver and consolidated time loss per driver, to address these shortcomings. We also show that improving time loss implies improving travel time and that there is a direct relationship between the time loss objective and the agents' reward. Our experimental results indicate that our time loss-based RL formulation improves time savings by 6% compared to other commonly adopted state-of-the-art formulations.
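The abstract does not define its time loss metrics, so the following is only an illustrative sketch under a stated assumption: a vehicle's time loss over one simulation step is the step length multiplied by the fraction of the speed limit it is not driving at. The function names, the single constant speed limit `v_max`, and the sample speed trace are all hypothetical and are not taken from the paper. Under that assumption, travel time splits into a fixed free-flow time plus the accumulated time loss, which is one way to see why reducing time loss also reduces travel time and why a per-step quantity is easier to turn into a reward than a value known only at the end of a trip.

```python
def time_loss(speeds, v_max, dt=1.0):
    """Accumulated time loss for one vehicle (assumed definition, not the paper's).

    Each step contributes dt * (1 - v / v_max): zero at the speed limit,
    a full dt when the vehicle is stopped.
    """
    return sum((1.0 - min(v, v_max) / v_max) * dt for v in speeds)


def travel_time(speeds, dt=1.0):
    """Total time the vehicle spent in the network."""
    return len(speeds) * dt


def free_flow_time(speeds, v_max, dt=1.0):
    """Time the same distance would take at the speed limit."""
    distance = sum(min(v, v_max) * dt for v in speeds)
    return distance / v_max


# Hypothetical speed trace in m/s, one sample per 1 s step:
# stopped at a red light, then accelerating to a 10 m/s limit.
speeds = [0.0, 0.0, 5.0, 10.0, 10.0]
v_max = 10.0

loss = time_loss(speeds, v_max)
total = travel_time(speeds)
ideal = free_flow_time(speeds, v_max)

# Travel time decomposes into free-flow time plus time loss.
assert abs(total - (ideal + loss)) < 1e-9
```

Because the free-flow term depends only on the route and the speed limit in this sketch, any reduction in accumulated time loss shows up one-for-one in travel time.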