Model-Distributed DNN Training for Memory-Constrained Edge Computing Devices
2021
We consider a model-distributed learning framework in which the layers of a deep learning model are distributed across multiple workers. To achieve consistent gradient updates during the training phase, model-distributed learning requires storing multiple versions of the layer parameters at every worker. In this paper, we design mcPipe to reduce the memory cost of model-distributed learning, which is crucial for memory-constrained edge computing devices. mcPipe uses an on-demand weight updating policy, which reduces the number of weight versions that must be stored at each worker. We analyze the memory cost of mcPipe and demonstrate its superior performance as compared to existing model-distributed learning mechanisms. We implement mcPipe on a real testbed and show that it reduces memory cost without hurting convergence rate or computation cost.
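The abstract does not give implementation details, so the following is a minimal, illustrative Python sketch (not the authors' mcPipe implementation) of the memory trade-off it describes: conventional weight stashing keeps one weight copy per in-flight microbatch, while an assumed on-demand update policy keeps only the live weights plus a single accumulated-gradient buffer and applies the update when the pipeline drains. All names (`Stage`, `weight_stash`, `pending_grad`) and the exact policy semantics are assumptions made for illustration.

```python
import numpy as np


class Stage:
    """One pipeline stage (a single linear layer) hosted on one worker."""

    def __init__(self, dim, on_demand=False):
        self.w = np.random.randn(dim, dim) * 0.01
        self.on_demand = on_demand
        self.weight_stash = {}     # microbatch id -> weight copy used in its forward pass
        self.pending_grad = None   # accumulated, not-yet-applied gradient (on-demand policy)
        self.in_flight = 0
        self.peak_buffers = 0      # peak number of extra weight-sized buffers held

    def _track(self):
        extra = len(self.weight_stash) + (self.pending_grad is not None)
        self.peak_buffers = max(self.peak_buffers, extra)

    def forward(self, mb_id, x):
        if not self.on_demand:
            # Weight stashing: every in-flight microbatch keeps its own weight copy,
            # so memory grows with the number of microbatches in the pipeline.
            self.weight_stash[mb_id] = self.w.copy()
        self.in_flight += 1
        self._track()
        return x @ self.w

    def backward(self, mb_id, x, grad_out, lr=0.1):
        if self.on_demand:
            # Assumed on-demand policy: keep only the current weights plus one
            # accumulated-gradient buffer, and apply the update once the pipeline drains.
            grad_w = x.T @ grad_out
            self.pending_grad = grad_w if self.pending_grad is None else self.pending_grad + grad_w
            grad_x = grad_out @ self.w.T   # current weights still match the forward-pass version
            self.in_flight -= 1
            self._track()
            if self.in_flight == 0:
                self.w -= lr * self.pending_grad
                self.pending_grad = None
        else:
            w_used = self.weight_stash.pop(mb_id)
            grad_w = x.T @ grad_out
            grad_x = grad_out @ w_used.T   # gradient consistency comes from the stashed copy
            self.in_flight -= 1
            self.w -= lr * grad_w          # update immediately after each backward pass
        return grad_x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for on_demand in (False, True):
        stage = Stage(dim=4, on_demand=on_demand)
        xs = {i: rng.standard_normal((2, 4)) for i in range(3)}
        outs = {i: stage.forward(i, xs[i]) for i in range(3)}   # three microbatches in flight
        for i in range(3):
            stage.backward(i, xs[i], np.ones_like(outs[i]))
        print(f"on_demand={on_demand}: peak extra weight buffers = {stage.peak_buffers}")
```

With three microbatches in flight, the stashing variant peaks at three extra weight buffers while the on-demand variant peaks at one, which is the kind of per-worker memory reduction the abstract claims; the real system must also handle activation memory and staleness across stages, which this toy single-stage example ignores.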