A Highly Parallel FPGA Implementation of Sparse Neural Network Training

Sourya Dey, Diandian Chen, Zongyang Li, Souvik Kundu, Kuan-Wen Huang, Keith M. Chugg, Peter A. Beerel

2018

This paper describes an FPGA implementation of a parallel, reconfigurable architecture for sparse neural networks that supports both on-chip training and inference. The network connectivity uses pre-determined, structured sparsity, which significantly reduces complexity by lowering memory and computational requirements. The architecture is organized around edge-processing, which leads to efficient pipelining and parallelization. Moreover, the device can be reconfigured to trade off resource utilization against training time, so that it fits networks and datasets of varying sizes. Together, the reduced complexity and easy reconfigurability enable greater on-chip exploration of network hyperparameters and structures. As a proof of concept, we show implementation results on an Artix-7 FPGA.
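To make the idea of pre-determined, structured sparsity concrete, the following is a minimal software sketch, not the hardware design described in the paper. It assumes one simple way of fixing the connectivity before training: every left-side neuron gets the same out-degree and every right-side neuron the same in-degree. The function names `structured_sparse_edges` and `sparse_forward`, the interleaving rule, and the layer sizes are all hypothetical choices made for illustration. The forward pass loops over edges rather than neurons, which is only loosely analogous to the edge-processing the abstract mentions.

```python
def structured_sparse_edges(n_left, n_right, out_degree):
    """Return a fixed list of (left, right) edges chosen before training.

    Assumed construction (illustrative only): edge e leaves left neuron
    e // out_degree and enters right neuron e % n_right, so every left
    neuron has the same out-degree and every right neuron the same in-degree.
    """
    if out_degree > n_right:
        raise ValueError("out_degree cannot exceed n_right")
    total_edges = n_left * out_degree
    if total_edges % n_right != 0:
        raise ValueError("edge count must divide evenly among right neurons")
    return [(e // out_degree, e % n_right) for e in range(total_edges)]


def sparse_forward(inputs, edges, weights, n_right):
    """Accumulate weighted inputs edge by edge (one weight per edge)."""
    outputs = [0.0] * n_right
    for (left, right), w in zip(edges, weights):
        outputs[right] += w * inputs[left]
    return outputs


# Hypothetical layer: 8 inputs, 4 outputs, out-degree 2.
edges = structured_sparse_edges(n_left=8, n_right=4, out_degree=2)
dense_weights = 8 * 4          # weights in a fully connected layer
sparse_weights = len(edges)    # weights actually stored and updated
activations = sparse_forward([1.0] * 8, edges, [1.0] * sparse_weights, 4)
```

In this toy layer only 16 of the 32 possible weights exist, so both the weight storage and the number of multiply-accumulate operations per pass are halved. Because the pattern is fixed in advance, no index search is needed at run time.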

Keywords:

Field-programmable gate array
Proof of concept
Computer science
Computer architecture
Distributed computing
Artificial neural network
Inference
Architecture
Reduction (complexity)
Pipeline (computing)
Reconfigurability
Hyperparameter
