A GPU memory efficient speed-up scheme for training ultra-deep neural networks: poster

2019 
Ultra-deep neural networks (UDNNs) tend to yield higher-quality models, but their training process is often difficult to handle. Scarce GPU DRAM capacity is the primary bottleneck that limits the depth of the network and the range of trainable minibatch sizes. In this paper, we present a scheme dedicated to making the utmost use of the finite GPU memory resource to speed up the training process for UDNNs. First, a performance-model-guided dynamic swap-out/in strategy between GPU and host memory is carefully orchestrated to tackle the out-of-memory problem without introducing a performance penalty. Then, a hyperparameter (minibatch size, learning rate) tuning policy is designed to explore the optimal configuration after applying the swap strategy, considering training time and final accuracy simultaneously. Finally, we verify the effectiveness of our scheme in both single-GPU and distributed-GPU modes.
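The core idea behind the swap strategy described above is to move intermediate activations out of GPU memory during the forward pass and bring them back just before the backward pass needs them. The sketch below illustrates that general idea only and is not the paper's implementation; it relies on PyTorch's saved_tensors_hooks mechanism, and the network definition (DeepNet, its depth, hidden size, and batch size) is an illustrative assumption.

```python
# Minimal sketch of activation swap-out/in between GPU and host memory.
# Not the authors' implementation; all model/parameter names are illustrative.
import torch
import torch.nn as nn


def pack_to_host(t: torch.Tensor):
    # Swap out: copy the saved activation to host (CPU) memory so the GPU
    # buffer can be released once it is no longer referenced elsewhere.
    if t.is_cuda:
        return ("host", t.to("cpu", non_blocking=True))
    return ("keep", t)


def unpack_to_gpu(packed):
    # Swap in: move the activation back to the GPU right before the
    # backward pass consumes it.
    where, t = packed
    return t.cuda(non_blocking=True) if where == "host" else t


class DeepNet(nn.Module):
    """Illustrative 'ultra-deep' stack of linear layers."""

    def __init__(self, depth=200, hidden_dim=1024):
        super().__init__()
        self.layers = nn.Sequential(
            *[nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU())
              for _ in range(depth)]
        )

    def forward(self, x):
        return self.layers(x)


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = DeepNet().to(device)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.randn(32, 1024, device=device)

    # All activations saved for backward are transparently swapped to host
    # memory, trading PCIe transfer time for a much smaller GPU footprint.
    with torch.autograd.graph.saved_tensors_hooks(pack_to_host, unpack_to_gpu):
        loss = model(x).pow(2).mean()
    loss.backward()
    opt.step()
```

In a production scheme such as the one the paper outlines, the decision of which tensors to swap and when to prefetch them back would be driven by a performance model rather than applied uniformly as in this sketch.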