A Nash Q-Learning based motion decision algorithm considering interaction with traffic participants

2020 
To improve the efficiency and comfort of autonomous vehicles while ensuring safety, the decision algorithm needs to interact with human drivers, infer their most probable behavior, and then make an advantageous decision. This paper proposes a Nash-Q learning based motion decision algorithm that accounts for this interaction. First, the local trajectory of the surrounding vehicle is predicted from kinematic constraints, which reflects its short-term motion trend. Then, a future action space consisting of five basis actions is built on the predicted local trajectory. With that, the Nash-Q learning process is implemented as a game between these basis actions. Through elimination of strictly dominated actions and the Lemke-Howson method, the autonomous vehicle can decide the optimal action and infer the behavior of the surrounding vehicle. Finally, a lane-merging scenario is built to compare the performance against existing methods, and a driver-in-the-loop experiment is further designed to verify the interaction performance in multi-vehicle traffic. The results show that the Nash-Q learning based algorithm improves efficiency and comfort by 15.75% and 20.71% relative to the Stackelberg game method and the no-interaction method, respectively, while safety is ensured. It can also interact with human drivers in real time in multi-vehicle traffic.
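To make the stage-game step of the described approach concrete, the sketch below shows one way such a step could look: two vehicles each choose among five basis actions, strictly dominated actions are iteratively eliminated, and the reduced bimatrix game would then be handed to a Lemke-Howson solver; a standard Nash-Q temporal-difference update is also included. All payoff values, action names, function names, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a Nash-Q stage game between an ego vehicle and a surrounding
# vehicle, each with five basis actions (action names are assumed for illustration).
import itertools
import numpy as np

ACTIONS = ["keep", "accelerate", "decelerate", "merge_left", "merge_right"]  # assumed basis actions


def eliminate_strictly_dominated(A, B):
    """Iteratively remove pure actions strictly dominated by another pure action.

    A[i, j]: ego payoff, B[i, j]: surrounding-vehicle payoff when the ego plays
    action i and the other vehicle plays action j. Returns surviving indices.
    """
    rows = list(range(A.shape[0]))
    cols = list(range(A.shape[1]))
    changed = True
    while changed:
        changed = False
        # Ego action i is strictly dominated by k if k is strictly better for every surviving j.
        for i, k in itertools.permutations(rows, 2):
            if all(A[k, j] > A[i, j] for j in cols):
                rows.remove(i)
                changed = True
                break
        # Surrounding-vehicle action j is strictly dominated by l analogously.
        for j, l in itertools.permutations(cols, 2):
            if all(B[i, l] > B[i, j] for i in rows):
                cols.remove(j)
                changed = True
                break
    return rows, cols


def nash_q_update(Q_ego, state, a_ego, a_other, reward, nash_value_next, alpha=0.1, gamma=0.9):
    """Standard Nash-Q temporal-difference update on the ego's joint-action Q-table.

    nash_value_next is the ego payoff at the Nash equilibrium of the next state's stage game.
    """
    td_target = reward + gamma * nash_value_next
    Q_ego[state][a_ego, a_other] += alpha * (td_target - Q_ego[state][a_ego, a_other])


# Illustrative 5x5 payoff matrices for one state of a lane-merge scenario (random values).
rng = np.random.default_rng(0)
A = rng.uniform(-1, 1, (5, 5))   # ego payoffs (assumed)
B = rng.uniform(-1, 1, (5, 5))   # surrounding-vehicle payoffs (assumed)
rows, cols = eliminate_strictly_dominated(A, B)
print("surviving ego actions:  ", [ACTIONS[i] for i in rows])
print("surviving other actions:", [ACTIONS[j] for j in cols])
# The reduced game A[np.ix_(rows, cols)], B[np.ix_(rows, cols)] would then be solved with a
# bimatrix Nash solver such as the Lemke-Howson method (e.g. nashpy's Game.lemke_howson),
# and the resulting equilibrium payoff fed back through nash_q_update:
Q_ego = {0: np.zeros((5, 5))}
nash_q_update(Q_ego, state=0, a_ego=1, a_other=2, reward=0.5, nash_value_next=0.0)
```

Eliminating strictly dominated actions before solving shrinks the game the Lemke-Howson method has to handle, which is the likely motivation for pairing the two steps in the abstract.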