Training Acceleration for Deep Neural Networks: A Hybrid Parallelization Strategy

2021 
Deep Neural Networks (DNNs) are widely investigated due to their striking performance in various applications of artificial intelligence. However, as DNNs become larger and deeper, the computing resources of a single hardware accelerator are insufficient to meet the training requirements of popular DNNs, so they must be trained across multiple accelerators in a distributed setting. To better utilize the accelerators and speed up training, the whole training process must be partitioned into segments that can run in parallel. However, intra-layer parallelization techniques (i.e., data and model parallelism) often face communication and memory bottlenecks, while the performance and resource utilization of inter-layer parallelization techniques (i.e., pipelining) depend on how the model can be partitioned. We present EffTra, a synchronous hybrid parallelization strategy that combines intra-layer and inter-layer parallelism to train DNNs in a distributed fashion. EffTra uses dynamic programming to search for an optimal partitioning of a DNN model and assigns devices to the resulting partitions. Our evaluation shows that EffTra accelerates training by up to 2.0x and 1.78x compared to state-of-the-art inter-layer (i.e., GPipe) and intra-layer (i.e., data parallelism) parallelization techniques, respectively.
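To make the dynamic-programming idea concrete, the sketch below shows one common way such a partitioning search can be formulated: split a sequence of per-layer costs into a fixed number of contiguous pipeline stages so that the slowest stage (the pipeline bottleneck) is as fast as possible. This is a minimal illustration under assumed inputs (a simple per-layer cost model and a given stage count); the function names and the cost model are hypothetical and do not reproduce EffTra's actual formulation or device-assignment step.

```python
# Hypothetical sketch: DP-based partitioning of a layer sequence into
# `num_stages` contiguous pipeline stages, minimizing the bottleneck
# (maximum per-stage) cost. Not EffTra's actual algorithm.

from functools import lru_cache

def partition_layers(layer_costs, num_stages):
    """Return (bottleneck_cost, stage_boundaries) for a balanced split."""
    n = len(layer_costs)
    prefix = [0.0]
    for c in layer_costs:
        prefix.append(prefix[-1] + c)

    def span(i, j):
        # Total cost of layers i..j-1.
        return prefix[j] - prefix[i]

    @lru_cache(maxsize=None)
    def best(i, k):
        # Minimal bottleneck when layers i..n-1 are split into k stages,
        # together with the end index of each stage.
        if k == 1:
            return span(i, n), (n,)
        result = (float("inf"), ())
        for j in range(i + 1, n - k + 2):  # first stage = layers i..j-1
            tail_cost, tail_cuts = best(j, k - 1)
            bottleneck = max(span(i, j), tail_cost)
            if bottleneck < result[0]:
                result = (bottleneck, (j,) + tail_cuts)
        return result

    return best(0, num_stages)

# Example: 8 layers with assumed per-layer forward+backward times,
# split into 3 pipeline stages.
costs = [4.0, 2.0, 1.5, 3.0, 2.5, 1.0, 2.0, 3.5]
bottleneck, cuts = partition_layers(costs, 3)
print(bottleneck, cuts)  # 7.0 and the end index of each stage, e.g. (2, 5, 8)
```

In practice, a hybrid strategy would extend such a cost model with communication and memory terms and would also decide how many devices to replicate each stage over (intra-layer parallelism), which is the part this sketch deliberately omits.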