A Reinforcement Learning-based solution for Intra-domain Egress Selection

2021 
In a large network, an ingress router often has multiple candidate egress points through which it can forward traffic to external networks. The traditional solution is to choose the egress node closest to the ingress node, i.e., the one with the shortest path. This paper discusses the drawbacks of this approach in a flexible network system and introduces our proposal, MAB-based Egress Selection. Our approach uses several Reinforcement Learning techniques commonly used to solve the Multi-Armed Bandit (MAB) problem, allowing the ingress router to periodically re-select its egress point and thereby optimize the long-term performance of traffic transmission. To formalize the egress selection process as a MAB problem, we use a combined score of delay and loss, representing link status, as the reward. However, capturing those network metrics is difficult under distributed control, where each node has only a restricted local view of the network. A centralized control architecture, e.g., Software-Defined Networking (SDN), is a promising candidate for this purpose. We apply four common algorithms, ϵ-greedy, Softmax, UCB1 and Single Pull UCB2 (SP-UCB2), to the egress selection process. The models are evaluated on two simulated network topologies under different network traffic conditions. The experimental results show that the UCB algorithms perform best, especially in busy networks.
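To make the MAB formulation concrete, the following is a minimal Python sketch of egress selection with standard UCB1, where each egress point is an arm. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the abstract does not give the exact form of the combined delay and loss score, so the `reward` function, its `max_delay_ms` and `w_delay` parameters, the egress names, and the simulated path statistics are all hypothetical.

```python
import math
import random


def reward(delay_ms, loss_rate, max_delay_ms=200.0, w_delay=0.5):
    """Hypothetical combined delay/loss score in [0, 1]; higher is better.

    The weighting and normalization are assumptions for illustration only.
    """
    delay_score = 1.0 - min(max(delay_ms, 0.0), max_delay_ms) / max_delay_ms
    loss_score = 1.0 - min(max(loss_rate, 0.0), 1.0)
    return w_delay * delay_score + (1.0 - w_delay) * loss_score


class UCB1EgressSelector:
    """Standard UCB1 over a fixed set of egress points (the arms)."""

    def __init__(self, egresses):
        self.egresses = list(egresses)
        self.counts = {e: 0 for e in self.egresses}   # times each egress was used
        self.means = {e: 0.0 for e in self.egresses}  # running mean reward
        self.total = 0                                # total selections so far

    def select(self):
        # Use every egress once before applying the confidence bound.
        for e in self.egresses:
            if self.counts[e] == 0:
                return e
        # Pick the egress with the highest mean reward plus exploration bonus.
        return max(
            self.egresses,
            key=lambda e: self.means[e]
            + math.sqrt(2.0 * math.log(self.total) / self.counts[e]),
        )

    def update(self, egress, r):
        # Incremental update of the running mean for the chosen egress.
        self.counts[egress] += 1
        self.total += 1
        self.means[egress] += (r - self.means[egress]) / self.counts[egress]


def simulate(rounds=2000, seed=0):
    """Run the selector against made-up path statistics; return usage counts."""
    rng = random.Random(seed)
    # Hypothetical (mean delay in ms, mean loss rate) per egress point.
    paths = {
        "egress_a": (20.0, 0.01),
        "egress_b": (150.0, 0.10),
        "egress_c": (90.0, 0.30),
    }
    selector = UCB1EgressSelector(paths)
    for _ in range(rounds):
        egress = selector.select()
        mean_delay, mean_loss = paths[egress]
        # Noisy measurements standing in for what a controller would report.
        delay = rng.gauss(mean_delay, 10.0)
        loss = rng.gauss(mean_loss, 0.02)
        selector.update(egress, reward(delay, loss))
    return selector.counts


if __name__ == "__main__":
    print(simulate())
```

In this sketch each call to `select` corresponds to one periodic re-selection by the ingress router, and `update` feeds back the score measured over that period. In an SDN deployment the delay and loss values would come from the controller's view of the network rather than from a random generator.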