A GPU Implementation of the Sparse Deep Neural Network Graph Challenge
2019
This paper presents a CUDA implementation of the latest addition to the Graph Challenge, the inference computation on a collection of large sparse deep neural networks. A single Tesla V100 can compute the inference at 3.7 TeraEdges/s. Using the managed memory API available in CUDA allows for simple and efficient distribution of these computations across a multiGPU NVIDIA DGX-2 server.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
10
References
15
Citations
NaN
KQI