Analyzing the distributed training of deep-learning models via data locality

2021 
In the last few years, deep-learning models have become crucial for numerous scientific and industrial applications. Due to the growing size and complexity of deep neural networks, researchers have been investigating techniques to train these networks more efficiently. Many efforts have been made to optimize deep-learning models by parallelizing or distributing their training computation across multiple devices. Current state-of-the-art techniques, such as Horovod, have been shown to maximize the performance of both the training computation and the inter-node communication across different deep-learning frameworks. However, some applications cannot take advantage of these techniques because of an I/O bottleneck caused by the input data, which limits the scalability of training. In this paper, we study an approach based on data locality, which has not yet been fully studied, for those neural networks that cannot benefit from scaling their computation due to a significant bottleneck in data I/O.
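For context, the data-parallel setup the abstract refers to typically looks like the following. This is a minimal sketch using Horovod's PyTorch binding; the toy model, dataset, and learning-rate scaling are illustrative assumptions, not the paper's actual configuration, and note that each worker still has to read its own shard of the input data, which is where the I/O bottleneck discussed above appears.

```python
# Minimal sketch of Horovod-style data-parallel training (PyTorch backend).
# The model, dataset, and hyperparameters below are illustrative placeholders,
# not the configuration studied in the paper.
import torch
import torch.nn as nn
import torch.utils.data as data
import horovod.torch as hvd


def train():
    hvd.init()                                   # one process per device
    if torch.cuda.is_available():
        torch.cuda.set_device(hvd.local_rank())  # pin each process to its local GPU

    # Toy in-memory dataset standing in for a real (I/O-bound) input pipeline.
    features = torch.randn(4096, 32)
    labels = torch.randint(0, 10, (4096,))
    dataset = data.TensorDataset(features, labels)

    # Each worker reads only its shard of the data (data-parallel input pipeline).
    sampler = data.distributed.DistributedSampler(
        dataset, num_replicas=hvd.size(), rank=hvd.rank())
    loader = data.DataLoader(dataset, batch_size=64, sampler=sampler)

    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
    if torch.cuda.is_available():
        model.cuda()

    # Common heuristic: scale the learning rate with the number of workers.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())
    # Wrap the optimizer so gradients are averaged across workers via allreduce.
    optimizer = hvd.DistributedOptimizer(
        optimizer, named_parameters=model.named_parameters())

    # Start all workers from identical parameters and optimizer state.
    hvd.broadcast_parameters(model.state_dict(), root_rank=0)
    hvd.broadcast_optimizer_state(optimizer, root_rank=0)

    loss_fn = nn.CrossEntropyLoss()
    for epoch in range(3):
        sampler.set_epoch(epoch)                 # reshuffle shards each epoch
        for x, y in loader:
            if torch.cuda.is_available():
                x, y = x.cuda(), y.cuda()
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
        if hvd.rank() == 0:
            print(f"epoch {epoch}: loss {loss.item():.4f}")


if __name__ == "__main__":
    train()
```

Such a script is launched with one process per device, for example `horovodrun -np 4 python train.py`; the computation and gradient exchange then scale with the number of workers, while the per-worker input reads remain a potential bottleneck.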