Transfer restless multi-armed bandit policy for energy-efficient heterogeneous cellular network

2019 
This paper proposes a learning policy to improve the energy efficiency (EE) of heterogeneous cellular networks. The combination of active and inactive base stations (BS) that allows for maximizing EE is identified as a combinatorial learning problem and requires high computational complexity as well as a large signaling overhead. This paper aims at presenting a learning policy that dynamically switches a BS to ON or OFF status in order to follow the traffic load variation during the day. The network traffic load is represented as a Markov decision process, and we propose a modified upper confidence bound algorithm based on restless Markov multi-armed bandit framework for the BS switching operation. Moreover, to cope with initial reward loss and to speed up the convergence of the learning algorithm, the transfer learning concept is adapted to our algorithm in order to benefit from the transferred knowledge observed in historical periods from the same region. Based on our previous work, a convergence theorem is provided for the proposed policy. Extensive simulations demonstrate that the proposed algorithms follow the traffic load variation during the day and contribute to a performance jump-start in EE improvement under various practical traffic load profiles. It also demonstrates that proposed schemes can significantly reduce the total energy consumption of cellular network, e.g., up to 70% potential energy savings based on a real traffic profile.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    51
    References
    2
    Citations
    NaN
    KQI
    []