Cost-efficient and Skew-aware Data Scheduling for Incremental Learning in 5G Networks

2021 
To facilitate emerging applications in 5G networks, mobile network operators will provide many network functions for control and prediction. Recently, they have recognized the power of machine learning (ML) and started to explore its potential to support those network functions. Nevertheless, current ML models for network functions are often derived offline, which is inefficient: transmitting huge volumes of data to remote ML training clouds incurs excessive overhead, and offline training provides no incremental learning capability for continuous model updating. As an alternative, we propose Cocktail, an incremental learning framework within a reference 5G network architecture. To achieve cost efficiency while increasing the accuracy of the trained model, an efficient online data scheduling policy is essential. To this end, we formulate an online data scheduling problem that optimizes the framework cost while alleviating the data skew caused by the capacity heterogeneity of training workers, from a long-term perspective. We exploit stochastic gradient descent to devise an online, asymptotically optimal algorithm, which includes two optimal policies based on novel graph constructions for skew-aware data collection and data training. A small-scale testbed and large-scale simulations validate the superior performance of the proposed framework.