Qauxi: Cooperative multi-agent reinforcement learning with knowledge transferred from auxiliary task

2022 
Qauxi forms a coordinated exploration scheme that improves traditional MARL algorithms by reusing meta-experience transferred from an auxiliary task. We also use a weighting function to weight the importance of each joint action in the monotonic loss function, so that training focuses on the more important joint actions and avoids yielding suboptimal policies. Furthermore, we prove the convergence of Qauxi based on the contraction mapping theorem. Qauxi is evaluated on the widely adopted StarCraft Multi-Agent Challenge (SMAC) benchmark across easy, hard, and super hard scenarios. Experimental results show that the proposed method outperforms state-of-the-art MARL methods by a large margin in the most challenging super hard scenarios.
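The idea of weighting joint actions inside a loss can be illustrated with a minimal sketch. The abstract does not give Qauxi's actual weighting function or loss, so everything below is an assumption made for illustration: the function names `weighted_td_loss` and `example_weights`, the parameter `alpha`, and the rule "full weight when the target exceeds the current estimate, reduced weight otherwise" are all hypothetical and are not taken from the paper.

```python
def example_weights(q_tot, targets, alpha=0.5):
    """Hypothetical weighting rule (NOT from the paper).

    Gives weight 1.0 to joint actions whose target exceeds the current
    joint value estimate, and a smaller weight `alpha` to the rest.
    """
    return [1.0 if y > q else alpha for q, y in zip(q_tot, targets)]


def weighted_td_loss(q_tot, targets, weights):
    """Mean of w * (q_tot - target)^2 over a batch of joint actions."""
    if not (len(q_tot) == len(targets) == len(weights)):
        raise ValueError("q_tot, targets and weights must have equal length")
    if not q_tot:
        raise ValueError("batch must not be empty")
    total = sum(w * (q - y) ** 2 for q, y, w in zip(q_tot, targets, weights))
    return total / len(q_tot)


# Toy batch of two joint actions (values are made up for illustration).
q_tot = [1.0, 2.0]    # current joint action-value estimates
targets = [2.0, 1.0]  # corresponding TD targets
weights = example_weights(q_tot, targets)
loss = weighted_td_loss(q_tot, targets, weights)
```

In this toy batch both joint actions have the same squared error, but the underestimated one contributes twice as much to the loss, which is the sense in which a weighting function lets training focus on some joint actions more than others.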