Learning to Schedule Joint Radar-Communication with Deep Multi-Agent Reinforcement Learning

2021 
Radar detection and communication are two of several sub-tasks essential for the operation of next-generation autonomous vehicles (AVs). The former is required for sensing and perception, and is needed more frequently under unfavorable environmental conditions such as heavy precipitation; the latter is needed to transmit time-critical data. The forthcoming proliferation of faster 5G networks utilizing mmWave is likely to cause interference with automotive radar sensors, which has motivated a body of research on Joint Radar Communication (JRC) systems and solutions. This paper considers the problem of time-sharing for JRC, with the additional simultaneous objective of minimizing the average age of information (AoI) transmitted by a JRC-equipped AV. We first formulate the problem as a Markov Decision Process (MDP). We then propose a more general multi-agent system, with an appropriate medium access control (MAC) protocol, which is formulated as a partially observed Markov game (POMG). To solve the POMG, we propose a multi-agent extension of the Proximal Policy Optimization (PPO) algorithm, along with algorithmic features to enhance learning from raw observations. Simulations are run with a range of environmental parameters to mimic variations in real-world operation. The results show that the chosen deep reinforcement learning methods allow the agents to obtain good results with minimal a priori knowledge about the environment.
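
To make the time-sharing trade-off concrete, the following is a minimal sketch of how the average AoI could evolve under a fixed schedule of radar and communication slots. It is an illustration only and not the model used in the paper: the slotted timeline, the rule that a communication slot always succeeds and resets the age to 1, and the names `RADAR`, `COMM`, and `average_aoi` are all assumptions introduced here.

```python
from typing import Sequence

# Hypothetical action labels for one time slot (not taken from the paper).
RADAR, COMM = 0, 1


def average_aoi(actions: Sequence[int], initial_age: int = 1) -> float:
    """Return the mean age of information over a slotted schedule.

    Assumed toy model: the age grows by one in every radar slot and is
    reset to 1 whenever the slot is used for communication, i.e. every
    transmission is assumed to succeed and deliver fresh data.
    """
    if not actions:
        raise ValueError("actions must contain at least one slot")
    age = initial_age
    total = 0
    for action in actions:
        if action == COMM:
            age = 1      # fresh data delivered in this slot
        else:
            age += 1     # data grows staler while the radar is in use
        total += age
    return total / len(actions)


if __name__ == "__main__":
    # Communicating more often lowers the average age, at the cost of radar slots.
    print(average_aoi([RADAR, RADAR, RADAR, COMM]))
    print(average_aoi([RADAR, COMM, RADAR, COMM]))
```

Under these assumptions, every slot given to communication keeps the transmitted data fresh but is a slot in which no radar detection takes place, which is the tension a learned scheduling policy has to balance.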