Multi-agent reinforcement learning by the actor-critic model with an attention interface

2021 
Abstract Multi-agent reinforcement learning algorithms have achieved satisfactory performance in various scenarios, but many of them encounter difficulties in partially observable environments, where the inability to perceive the full environment state leads to instability and convergence failures, especially in large-scale multi-agent settings. To improve interactions among homogeneous agents under partial observability, we propose a novel multi-agent actor-critic model with a visual attention interface. First, a recurrent visual attention interface extracts a latent state from each agent's partial observation. These latent states allow agents to focus on several local environments, in each of which an agent has a complete perception, so the intricate multi-agent environment is decomposed into interactions among the agents sharing the same local environment. The proposed method trains multi-agent systems under a centralized-training, decentralized-execution scheme. Because the number of agents in a local environment is not fixed, the joint action of agents is approximated using mean-field theory. Experimental results on the simulation platform suggest that our model outperforms the baselines when training large-scale multi-agent systems in partially observable environments.
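The abstract describes two key components: a recurrent attention module that compresses each agent's partial observation into a latent state, and a critic that conditions on the mean action of nearby agents (the mean-field approximation of the joint action). The paper does not give implementation details here, so the following is only a minimal sketch of that idea, assuming PyTorch, discrete actions, and illustrative names (AttentionEncoder, MeanFieldActorCritic) that are ours rather than the authors'.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AttentionEncoder(nn.Module):
        """Recurrent visual attention: attend over observation patches, then update a GRU state."""
        def __init__(self, patch_dim, hidden_dim):
            super().__init__()
            self.query = nn.Linear(hidden_dim, patch_dim)  # current latent state produces the attention query
            self.rnn = nn.GRUCell(patch_dim, hidden_dim)

        def forward(self, patches, h):
            # patches: (batch, num_patches, patch_dim); h: (batch, hidden_dim)
            q = self.query(h).unsqueeze(2)                   # (batch, patch_dim, 1)
            scores = torch.bmm(patches, q).squeeze(2)        # attention score per patch
            attn = F.softmax(scores, dim=1).unsqueeze(1)     # (batch, 1, num_patches)
            glimpse = torch.bmm(attn, patches).squeeze(1)    # attended summary of the observation
            return self.rnn(glimpse, h)                      # new latent state

    class MeanFieldActorCritic(nn.Module):
        """Actor acts on the latent state; critic also sees the neighbours' mean action."""
        def __init__(self, hidden_dim, n_actions):
            super().__init__()
            self.actor = nn.Linear(hidden_dim, n_actions)
            self.critic = nn.Linear(hidden_dim + n_actions, 1)

        def forward(self, latent, mean_action):
            logits = self.actor(latent)                                # policy logits
            value = self.critic(torch.cat([latent, mean_action], -1))  # value under the mean-field joint action
            return logits, value

    # Usage: one decision step for a batch of agents in the same local environment.
    enc, ac = AttentionEncoder(16, 32), MeanFieldActorCritic(32, 5)
    patches = torch.randn(4, 9, 16)      # 4 agents, 9 observation patches each (shapes are illustrative)
    h = torch.zeros(4, 32)               # initial recurrent latent state
    h = enc(patches, h)
    mean_act = torch.full((4, 5), 0.2)   # placeholder mean action of neighbouring agents
    logits, value = ac(h, mean_act)

In a decentralized execution phase, each agent would run only the encoder and actor on its own observation; the mean-field critic is needed during centralized training, which matches the centralized-training, decentralized-execution scheme named in the abstract.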