A GPU Implementation of the Sparse Deep Neural Network Graph Challenge

Mauro Bisson, Massimiliano Fatica

2019

This paper presents a CUDA implementation of the latest addition to the Graph Challenge: inference on a collection of large sparse deep neural networks. A single Tesla V100 GPU performs the inference at 3.7 TeraEdges/s. CUDA's managed memory API allows these computations to be distributed simply and efficiently across the GPUs of an NVIDIA DGX-2 server.

Keywords:

Theoretical computer science
Graph
Artificial neural network
Computer science
Computation
Deep neural networks
CUDA
Parallel computing
Inference
