Energy-Aware Workload Allocation for Distributed Deep Neural Networks in Edge-Cloud Continuum

2019 
This paper presents an energy-aware workload allocation framework for distributed deep neural networks (DNNs) in the Edge-Cloud continuum. In contrast to conventional approaches, where inference is performed entirely on a standalone device, a computation-communication model is proposed that distributes the computing tasks of different DNN layers across levels of the Edge-Cloud network to minimize the energy cost per inference. The model determines the optimal exit layer (EL), at which the intermediate data of the neural network are transmitted to the next higher level of the Edge-Cloud continuum. Case studies are presented for AlexNet and VGG-16 over a set of DNN processors and wireless interfaces. Using a GTX 1080 GPU with a compute efficiency of 22.8 GOPS/W and a WiFi interface with a transmission cost of 10 nJ/bit, the optimized energy consumption for AlexNet is estimated at 0.016 J, with the inference exiting the edge at the EL2 (Conv1) layer. For VGG-16, the optimal exit layer is EL1, with a minimum inference cost of 0.0482 J.
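The trade-off the abstract describes can be sketched as a simple search over candidate exit layers: run the first layers on the edge device, paying compute energy at the processor's GOPS/W efficiency, then pay the wireless cost (nJ/bit) to ship the intermediate activations upward. Below is a minimal, illustrative Python sketch under that assumption; the function name per_exit_energy, the layer operation counts, and the boundary data sizes are placeholders, not values from the paper.

def per_exit_energy(layer_gops, boundary_bits,
                    gops_per_watt=22.8, nj_per_bit=10.0):
    """Energy per inference (J) for each candidate exit point.

    layer_gops[i]    -- giga-operations of DNN layer i, executed on the edge
    boundary_bits[i] -- bits of intermediate data sent upward when exiting
                        before layer i (boundary_bits[0] = raw input size,
                        i.e. full offloading; the last entry is the final
                        output, i.e. fully local inference)
    """
    energies = []
    compute_j = 0.0
    for exit_point in range(len(boundary_bits)):
        # Wireless transmission cost of the data crossing the boundary
        tx_j = boundary_bits[exit_point] * nj_per_bit * 1e-9
        energies.append(compute_j + tx_j)
        if exit_point < len(layer_gops):
            # GOPS/W is equivalent to GOP per joule, so GOP / (GOP/J) = J
            compute_j += layer_gops[exit_point] / gops_per_watt
    return energies

# Toy numbers, not the paper's layer profile:
e = per_exit_energy(layer_gops=[0.21, 0.45, 0.30],
                    boundary_bits=[1.2e6, 2.3e6, 8.0e5, 3.2e4])
best_el = min(range(len(e)), key=e.__getitem__)
print(f"optimal exit layer: EL{best_el + 1}, energy: {e[best_el]:.4f} J")

Under this model, the optimal EL shifts toward the edge when the wireless interface is cheap relative to the processor, and toward local inference when transmission dominates, which matches the per-network optima reported in the abstract.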