Cooperative Convolutional Neural Network Deployment over Mobile Networks

2020 
Inference acceleration has drawn much attention as a way to meet the real-time requirements of artificial intelligence (AI) applications. To this end, model partitioning for Deep Neural Networks (DNNs) has been proposed to exploit parallel and distributed computing units. However, previous works focus on load balancing among servers and may overlook the interplay between computing and communication. This makes existing approaches less efficient, especially in mobile edge networks, where smart devices with limited computing capacity must offload tasks over limited bandwidth to nearby servers. In this paper, we therefore design a new system and formulate a new optimization problem, CONVENE, to minimize the inference completion time for smart devices equipped with one or more antennas. To explore its intrinsic properties, we first study CONVENE with a single antenna and derive an algorithm, THREAD-SA, that finds the optimum solution. We then propose an extension, THREAD, which subtly exploits multiple antennas to further reduce the completion time. Simulation results show that our algorithm outperforms the others by 100%.
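The trade-off the abstract describes can be illustrated with a minimal sketch. This is not the paper's THREAD or THREAD-SA algorithm; it is a generic single-cut DNN partition chooser under assumed per-layer timings, output sizes, and link bandwidth (all values and function names below are hypothetical, purely for illustration):

```python
def completion_time(device_t, server_t, out_bits, bw, k):
    """Completion time if layers [0, k) run on the device and [k, n) on the server.

    device_t[i] / server_t[i]: assumed per-layer execution times (seconds).
    out_bits[0]: input size in bits; out_bits[i]: output size of layer i.
    bw: assumed uplink bandwidth in bits/second.
    The cut at k transmits out_bits[k] (the last on-device output,
    or the raw input when k == 0) over the link.
    """
    return sum(device_t[:k]) + out_bits[k] / bw + sum(server_t[k:])


def best_partition(device_t, server_t, out_bits, bw):
    """Exhaustively pick the cut point minimizing total completion time."""
    n = len(device_t)
    return min(range(n + 1),
               key=lambda k: completion_time(device_t, server_t, out_bits, bw, k))


# Illustrative made-up numbers: a slow device, a fast server, and a layer
# (index 2) whose output is much smaller than the raw input.
device_t = [4.0, 3.0, 2.0]
server_t = [1.0, 1.0, 1.0]
out_bits = [10.0, 8.0, 1.0, 4.0]
bw = 1.0
k = best_partition(device_t, server_t, out_bits, bw)
print(k, completion_time(device_t, server_t, out_bits, bw, k))  # cut after layer 2
```

In this toy instance the optimum runs two layers locally so that only the small intermediate tensor crosses the limited link, which is the computing/communication interplay the paper argues load-balancing-only schemes overlook.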