Adaptive and Robust Network Routing Based on Deep Reinforcement Learning with Lyapunov Optimization

2020 
Recent developments in the Internet of Things bring massive time-sensitive yet bursty data flows. Adaptive network control has been explored using deep reinforcement learning, but it is not sufficient for extremely bursty network traffic flows, especially when the network traffic pattern may change over time. We model routing control in an environment with time-variant link delays as a Lyapunov optimization problem. We identify a tradeoff between optimization performance and modeling accuracy when the propagation delays are included. We propose a novel deep reinforcement learning-based adaptive network routing method to tackle these issues. A Lyapunov optimization technique is used to reduce the upper bound of the Lyapunov drift, which leads to improved queuing stability in networked systems. Experimental results show that the proposed method can learn a routing control policy and adapt to the changing environment. The proposed method outperforms the baseline backpressure method in multiple settings and converges faster than existing methods. Moreover, the deep reinforcement learning module can effectively learn a better estimate of the long-term Lyapunov drift and penalty functions, and thus it provides superior results in terms of backlog size, end-to-end latency, age of information, and throughput. Extensive experiments also show that the proposed model performs well under various topologies, and thus it can be used in general cases. In addition, the user can adjust the preference parameter at any time without retraining the neural networks.
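As a rough illustration of the drift-plus-penalty idea that the abstract builds on, the sketch below shows a one-step, backpressure-style next-hop choice at a single node. It is not the authors' method, which uses a deep reinforcement learning module to estimate long-term drift and penalty. The function names, the use of a per-link delay as the penalty, and the weight `v` (standing in for the preference parameter) are all assumptions made for this example.

```python
from typing import Dict, Optional, Tuple


def lyapunov(queues) -> float:
    """Quadratic Lyapunov function L(Q) = 1/2 * sum(Q_i^2) over queue backlogs."""
    return 0.5 * sum(q * q for q in queues)


def choose_next_hop(
    q_self: float,
    neighbors: Dict[str, Tuple[float, float]],
    v: float,
) -> Optional[str]:
    """Pick a next hop with a one-step drift-plus-penalty rule (hypothetical helper).

    neighbors maps a neighbor name to (backlog, delay_penalty).
    The score is the backlog differential minus v times the penalty.
    Returns None (hold the packet) when no neighbor has a positive score.
    """
    best, best_score = None, 0.0
    for name, (q_neighbor, penalty) in neighbors.items():
        score = (q_self - q_neighbor) - v * penalty
        if score > best_score:
            best, best_score = name, score
    return best


if __name__ == "__main__":
    links = {"a": (2.0, 5.0), "b": (6.0, 0.5)}  # made-up backlogs and delays
    for v in (0.0, 1.0, 10.0):
        print(v, choose_next_hop(10.0, links, v))
```

With `v = 0` the rule reduces to plain backpressure and follows the largest backlog differential. Raising `v` shifts the choice toward low-delay links, and because `v` is only an argument to the decision rule in this sketch, it can be changed between decisions.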