Toward Packet Routing With Fully Distributed Multiagent Deep Reinforcement Learning

2020 
Packet routing is one of the fundamental problems in computer networks, in which a router determines the next hop of each packet in its queue so that the packet reaches its destination as quickly as possible. Reinforcement learning (RL) has been introduced to design autonomous packet routing policies using local information about stochastic packet arrival and service. However, the curse of dimensionality in RL prohibits a more comprehensive representation of dynamic network states, thus limiting its potential benefit. In this article, we propose a novel packet routing framework based on multiagent deep RL (DRL) in which each router possesses an independent long short-term memory (LSTM) recurrent neural network (RNN) for training and decision making in a fully distributed environment. The LSTM RNN extracts routing features from rich information regarding backlogged packets and past actions, and effectively approximates the value function of Q-learning. We further allow each router to communicate periodically with its direct neighbors so that a broader view of the network state can be incorporated. The experimental results show that our multiagent DRL policy can strike a delicate balance between congestion-aware and shortest routes, and significantly reduces the packet delivery time in general network topologies compared with its counterparts.
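To make the Q-learning component concrete, the following is a minimal sketch of a per-router value update. It is a tabular stand-in, not the LSTM-based approximator described in the abstract. The class name `RouterAgent`, the choice of state (the packet's destination), the reward (negative delay), and the `next_value` argument (a value estimate assumed to be reported by the chosen neighbor) are all illustrative assumptions rather than details taken from the article.

```python
from collections import defaultdict


class RouterAgent:
    """Hypothetical tabular learner for one router (illustration only)."""

    def __init__(self, neighbors, alpha=0.5, gamma=1.0):
        self.neighbors = list(neighbors)
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # q[(destination, neighbor)] -> estimated value, here negative delay
        self.q = defaultdict(float)

    def best_value(self, destination):
        """Largest estimated value over all next hops for this destination."""
        return max(self.q[(destination, n)] for n in self.neighbors)

    def choose_next_hop(self, destination):
        """Greedy choice: the neighbor with the highest estimated value."""
        return max(self.neighbors, key=lambda n: self.q[(destination, n)])

    def update(self, destination, neighbor, reward, next_value):
        """Standard Q-learning step toward reward + gamma * next_value."""
        key = (destination, neighbor)
        target = reward + self.gamma * next_value
        self.q[key] += self.alpha * (target - self.q[key])
        return self.q[key]


agent = RouterAgent(["B", "C"])
agent.update("D", "B", reward=-3.0, next_value=0.0)  # slow path via B
agent.update("D", "C", reward=-1.0, next_value=0.0)  # faster path via C
print(agent.choose_next_hop("D"))
```

In this sketch each router learns only from its own observations, which loosely mirrors the fully distributed setting; a deep variant would replace the table with a network that maps richer local state to the same per-neighbor values.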