Asynchronous Methods for Multi-agent Deep Deterministic Policy Gradient

2018 
We propose a variant framework that optimizes the deep neural network controller with asynchronous gradient descent for the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. Using multiple CPU cores, we create several parallel environments, and each thread interacts with its own environment replica. Each copy samples prioritized batch data. We adjust the critic's evaluation method, using the advantage as the evaluation of an action. The batch data processed by the copies is collected, the loss value of each copy is computed, and the batch with the maximum loss is used as the sample for updating the global network. In addition, we demonstrate a successful application of multi-agent collaboration based on these asynchronous methods. The results show that the mean episode reward is higher than the reward obtained by the previous algorithm.
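As a rough illustration of the asynchronous scheme the abstract describes (this is not the authors' code), the sketch below runs one thread per environment replica, computes a per-copy critic loss from an advantage-style evaluation, and selects the maximum-loss batch for the global update. All names (worker_rollout, the dimensions, the loss) are hypothetical; the real environments, prioritized replay, and MADDPG actor/critic networks are replaced by NumPy stubs.

```python
# Minimal sketch of asynchronous max-loss batch selection (assumptions labeled above).
import numpy as np
from concurrent.futures import ThreadPoolExecutor

N_WORKERS = 4        # one thread per CPU core / environment replica (assumption)
BATCH_SIZE = 32
OBS_DIM, ACT_DIM = 8, 2

def worker_rollout(worker_id):
    """Each thread interacts with its own environment replica and returns
    a (prioritized) batch plus the critic loss computed on that batch."""
    rng = np.random.default_rng(worker_id)  # per-thread RNG; stands in for a real env
    batch = {
        "obs": rng.normal(size=(BATCH_SIZE, OBS_DIM)),
        "act": rng.normal(size=(BATCH_SIZE, ACT_DIM)),
        "rew": rng.normal(size=(BATCH_SIZE,)),
    }
    # Advantage-style evaluation: A(s, a) ~ Q(s, a) - V(s); here a mean
    # baseline stands in for the learned value estimate.
    baseline = batch["rew"].mean()
    advantage = batch["rew"] - baseline
    loss = float(np.mean(advantage ** 2))   # stand-in for the critic loss
    return loss, batch

# Collect batches from all replicas in parallel.
with ThreadPoolExecutor(max_workers=N_WORKERS) as pool:
    results = list(pool.map(worker_rollout, range(N_WORKERS)))

# Use the batch with maximum loss as the sample for the global network.
max_loss, hardest_batch = max(results, key=lambda r: r[0])
print(f"updating global network with batch of loss {max_loss:.4f}")
```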