Dynamic control of functional splits for energy harvesting virtual small cells: A distributed reinforcement learning approach

2019 
To meet the growing mobile data traffic demand, Mobile Network Operators (MNOs) are deploying dense infrastructures of small cells to enhance capacity. This densification increases the power consumption of mobile networks, thus impacting the environment. As a result, there is a recent trend of powering base stations with ambient energy sources to achieve both environmental sustainability and cost reductions. In addition, flexible functional splits in Cloud Radio Access Network (CRAN) are a promising solution to overcome the capacity and latency challenges in the fronthaul. In such an architecture, local base stations perform partial baseband processing, while the remaining part takes place at the central cloud. As cells become smaller and more densely deployed, baseband processing accounts for a large share of the total base station power consumption. In this paper, we propose a network scenario where the baseband processes of virtual small cells, powered solely by energy harvesters and batteries, can be opportunistically executed in a grid-connected edge computing server co-located at the macro base station site. We formulate the corresponding energy minimization problem and propose multi-agent Reinforcement Learning (RL) to solve it. Distributed Fuzzy Q-Learning and Q-Learning online algorithms are tailored to our purposes. Coordination among the multiple agents is fostered by broadcasting system-level information to the independent learners. The evaluation of the network performance confirms that favoring coordination among the agents via broadcasting may achieve higher system-level gains and cumulative rewards closer to the offline bounds than solutions that are unaware of system-level information. Finally, our analysis allows us to evaluate the benefits of continuous state/action representation for the learning algorithms in terms of faster convergence, higher cumulative reward, and adaptability to changing environments.
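The abstract does not spell out the algorithmic details, but the core idea, independent per-cell Q-learners whose local observations are augmented with a broadcast system-level signal before selecting a functional split, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the state layout (battery level, traffic load), the reward stub, the broadcast quantization, and all constants are hypothetical placeholders.

```python
import random
from collections import defaultdict

# --- Illustrative constants (not taken from the paper) ----------------------
N_CELLS  = 4      # number of energy-harvesting virtual small cells
N_SPLITS = 3      # functional-split options: 0 = all local, 2 = all at the edge
ALPHA    = 0.1    # learning rate
GAMMA    = 0.9    # discount factor
EPSILON  = 0.1    # exploration probability

class IndependentQLearner:
    """Tabular Q-learning agent; one instance per virtual small cell."""
    def __init__(self):
        self.q = defaultdict(float)  # Q[(state, action)] -> value

    def act(self, state):
        if random.random() < EPSILON:                                  # explore
            return random.randrange(N_SPLITS)
        return max(range(N_SPLITS), key=lambda a: self.q[(state, a)])  # exploit

    def update(self, s, a, r, s_next):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(s_next, b)] for b in range(N_SPLITS))
        self.q[(s, a)] += ALPHA * (r + GAMMA * best_next - self.q[(s, a)])

def local_state(cell):
    """Hypothetical per-cell observation: quantized battery level and traffic load.
    A real environment would expose measured harvester/battery/traffic values."""
    return (random.randrange(5), random.randrange(3))

def broadcast_signal(actions):
    """Hypothetical system-level information shared with all learners: the
    quantized fraction of baseband load pushed to the grid-connected edge server."""
    centralization = sum(actions) / (len(actions) * (N_SPLITS - 1))
    return round(centralization * 4)  # quantize to 5 levels

def reward(cell_state, action, signal):
    """Stub reward: penalize grid energy drawn at the edge (growing with the
    centralized share and overall server load) and battery depletion when
    processing locally. A faithful model would use the paper's power figures."""
    battery, traffic = cell_state
    edge_cost  = action * (1 + 0.5 * signal) * traffic
    local_cost = (N_SPLITS - 1 - action) * traffic * (2 if battery == 0 else 1)
    return -(edge_cost + local_cost)

agents = [IndependentQLearner() for _ in range(N_CELLS)]
prev_signal = 0
for step in range(10_000):
    states  = [local_state(i) + (prev_signal,) for i in range(N_CELLS)]
    actions = [agents[i].act(states[i]) for i in range(N_CELLS)]
    signal  = broadcast_signal(actions)          # system-level info, broadcast to all
    for i in range(N_CELLS):
        r = reward(states[i][:2], actions[i], signal)
        s_next = local_state(i) + (signal,)      # next observation includes broadcast
        agents[i].update(states[i], actions[i], r, s_next)
    prev_signal = signal
```

The paper's Fuzzy Q-Learning variant would replace the discrete table with fuzzy membership functions over continuous states and actions, which is what enables the faster convergence and adaptability the abstract reports; the sketch above only covers the plain tabular case.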