Deep Reinforcement Learning for Resource Allocation in Multi-platoon Vehicular Networks

2021 
Grouping vehicles into platoons is a promising cooperative driving application for improving the traffic safety and capacity of future vehicular networks. However, fast-changing channel conditions in high-mobility multi-platoon vehicular networks make resource allocation highly uncertain. Moreover, the growing popularity of emerging vehicle-to-infrastructure (V2I) applications may result in service demands with conflicting quality-of-experience requirements. In this paper, we formulate a multi-objective resource allocation problem that maximizes both the transmission success rate of intra-platoon communications and the capacity of V2I communications. To solve this problem efficiently, we model long-term resource allocation as a partially observable stochastic game, in which each platoon acts as an agent and each resource allocation solution corresponds to an action taken by that platoon. A Contribution-based Parallel Proximal Policy Optimization (CP-PPO) method is then employed so that each agent learns its subchannel selection and power allocation strategies in a distributed manner. In addition, we propose a deep reinforcement learning (DRL) based framework to achieve a good tradeoff between the two objectives. With appropriate reward design and training mechanisms, extensive simulation results show that the proposed method significantly outperforms the other methods considered.
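The formulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' CP-PPO implementation: the contribution-based and parallel components are not specified here and are left out. The sketch only shows (i) a joint subchannel and power action space for one platoon agent, (ii) a weighted-sum reward over the two objectives, and (iii) the standard PPO clipped surrogate for a single sample. All function names, the discrete power levels, the weight, and the capacity normalization are assumptions made for illustration.

```python
import math
from itertools import product


def build_action_space(num_subchannels, power_levels_dbm):
    """Enumerate joint (subchannel index, transmit power) actions for one agent.

    Discrete power levels are an assumption; the abstract only states that
    agents choose a subchannel and a power allocation.
    """
    return list(product(range(num_subchannels), power_levels_dbm))


def scalarized_reward(success_rate, v2i_capacity, weight=0.5, capacity_scale=1.0):
    """Weighted sum of the two objectives (hypothetical reward shape).

    weight in [0, 1] trades intra-platoon success rate against V2I capacity;
    capacity_scale normalizes capacity to a range comparable with a rate.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * success_rate + (1.0 - weight) * v2i_capacity / capacity_scale


def ppo_clipped_objective(new_logp, old_logp, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate for one sample (a quantity to maximize)."""
    ratio = math.exp(new_logp - old_logp)  # pi_new(a|o) / pi_old(a|o)
    clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    actions = build_action_space(num_subchannels=4, power_levels_dbm=[5, 15, 23])
    print(len(actions), "joint actions, first:", actions[0])
    print("reward:", scalarized_reward(0.9, 12.0, weight=0.6, capacity_scale=20.0))
    print("surrogate:", ppo_clipped_objective(-1.0, -1.2, advantage=0.5))
```

In a distributed setting, each platoon would hold its own policy over this action space and update it from its local observations; the weight in the reward is one simple way to expose the tradeoff between the two objectives.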