CuMF_SGD: Parallelized Stochastic Gradient Descent for Matrix Factorization on GPUs

2017 
Stochastic gradient descent (SGD) is widely used by many machine learning algorithms. It is efficient for big-data applications due to its low algorithmic complexity. However, SGD is inherently serial, and parallelizing it is not trivial; how to parallelize SGD on many-core architectures (e.g., GPUs) for high efficiency is a significant challenge. In this paper, we present cuMF_SGD, a parallelized SGD solution for matrix factorization on GPUs. We first design high-performance GPU computation kernels that accelerate individual SGD updates by exploiting model parallelism. We then design efficient schemes that parallelize SGD updates by exploiting data parallelism. Finally, we scale cuMF_SGD to large data sets that cannot fit into one GPU's memory. Evaluations on three public data sets show that cuMF_SGD outperforms existing solutions, including a 64-node CPU system, by a large margin using only one GPU card.
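To make the two levels of parallelism in the abstract concrete, here is a minimal CUDA sketch (not cuMF_SGD's published kernel) of a warp-cooperative SGD update for matrix factorization: each warp processes one observed rating, with its 32 lanes splitting the K-dimensional feature vectors among themselves (model parallelism within an update), while independent warps handle different ratings (data parallelism across updates). The latent dimension K, all names, and the row-major feature layout are assumptions for illustration.

```cuda
// Sketch only: warp-cooperative SGD update for matrix factorization.
// Not cuMF_SGD's actual kernel; names and layout are assumed.
#include <cuda_runtime.h>

#define K    128   // latent dimension, assumed divisible by the warp size
#define WARP 32

__global__ void warp_sgd_update(const int *users, const int *items,
                                const float *ratings, int n_ratings,
                                float *P, float *Q, float lr, float lambda) {
    int warp_id = (blockIdx.x * blockDim.x + threadIdx.x) / WARP;
    int lane    = threadIdx.x % WARP;
    if (warp_id >= n_ratings) return;   // one warp per observed rating

    float *p = P + (size_t)users[warp_id] * K;  // user feature row
    float *q = Q + (size_t)items[warp_id] * K;  // item feature row

    // Each lane accumulates its share of the dot product p . q.
    float partial = 0.0f;
    for (int k = lane; k < K; k += WARP)
        partial += p[k] * q[k];

    // Warp-wide reduction, then broadcast so every lane sees the error.
    for (int off = WARP / 2; off > 0; off /= 2)
        partial += __shfl_down_sync(0xffffffff, partial, off);
    float e = ratings[warp_id] - __shfl_sync(0xffffffff, partial, 0);

    // Lanes update their slices of both feature vectors in parallel.
    // Concurrent warps may race on shared rows; lock-free parallel SGD
    // schemes typically tolerate such occasional conflicts.
    for (int k = lane; k < K; k += WARP) {
        float pk = p[k], qk = q[k];
        p[k] = pk + lr * (e * qk - lambda * pk);
        q[k] = qk + lr * (e * pk - lambda * qk);
    }
}
```

The warp-per-update shape keeps each feature vector's accesses coalesced and uses register shuffles instead of shared memory for the dot-product reduction; scheduling which ratings each warp consumes, and batching updates across multiple GPUs, are where the paper's data-parallel and scaling schemes would come in.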