Multi-UCAV Air Combat in Short-Range Maneuver Strategy Generation using Reinforcement Learning and Curriculum Learning

2020 
We present an approach for learning a reactive maneuver strategy for a UCAV formation engaged in short-range multi-UCAV air combat. Specifically, we define an efficient state representation that reduces the complexity caused by the large state space of a multi-UCAV air combat engagement. A parameter-sharing dueling deep Q-network (PS-DDQN) algorithm is then proposed to train the UCAV formation. The learned reactive maneuver strategy is shared among our UCAVs to encourage cooperative behaviors. In addition, curriculum learning and self-play extend the maneuver strategy to more difficult scenarios, speeding up the training process and improving the learning effect. Finally, the effectiveness of the algorithm and the degree of intelligence of the maneuver strategy are verified by simulation tests of convergence and maneuver strategy quality.
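
As an illustration only, the two ideas named in PS-DDQN (a dueling Q-value head and parameters shared across all UCAVs) can be sketched as follows. This is not the authors' implementation: the linear heads stand in for a deep network, and every name here (`make_shared_params`, `dueling_q_values`, `greedy_actions`, the observation size, the action count) is a hypothetical choice made for the example.

```python
import random

def make_shared_params(obs_dim, n_actions, seed=0):
    """Build one parameter set; every UCAV reuses it (parameter sharing)."""
    rng = random.Random(seed)
    return {
        # Linear stand-in for the state-value stream V(s).
        "w_value": [rng.uniform(-0.1, 0.1) for _ in range(obs_dim)],
        # Linear stand-in for the advantage stream A(s, a), one row per action.
        "w_adv": [[rng.uniform(-0.1, 0.1) for _ in range(obs_dim)]
                  for _ in range(n_actions)],
    }

def dot(w, x):
    """Plain dot product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))

def dueling_q_values(params, obs):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a of A(s, a)."""
    value = dot(params["w_value"], obs)
    adv = [dot(w, obs) for w in params["w_adv"]]
    mean_adv = sum(adv) / len(adv)
    return [value + a - mean_adv for a in adv]

def greedy_actions(params, observations):
    """Each UCAV passes its own observation through the same shared parameters."""
    actions = []
    for obs in observations:
        q = dueling_q_values(params, obs)
        actions.append(max(range(len(q)), key=q.__getitem__))
    return actions

if __name__ == "__main__":
    shared = make_shared_params(obs_dim=4, n_actions=3)
    # Two UCAVs with different (made-up) local observations.
    formation_obs = [[0.5, -1.0, 0.2, 0.0], [0.1, 0.3, -0.7, 1.0]]
    print(greedy_actions(shared, formation_obs))
```

Because the mean advantage is subtracted, the Q-values for a state average to V(s), which is the property that separates the value and advantage streams. Sharing one parameter set means experience from any UCAV would update the strategy used by all of them.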