Dual-Precision Acceleration of Convolutional Neural Network Computation with Mixed Input and Output Data Reuse

2019 
Memory access dominates power consumption in hardware acceleration of deep neural network (DNN) computation due to the movement of large volumes of activations and weights. This paper presents a DNN accelerator that uses a mixed input and output data reuse scheme to balance internal memory size against memory access count, two contradictory design goals in resource-limited embedded systems. First, analytical forms for memory size and access count are derived for different data reuse methods in DNN convolution. After comparing the analysis results across the convolutional layers of the VGG-16 model under different levels of hardware parallelism, we implement a low-cost DNN hardware accelerator using the mixed input and output data reuse scheme with 32 processing elements (PEs) operating in parallel. Furthermore, the design supports two precision modes (8-bit and 16-bit) to accommodate variable precision requirements across DNN layers, yielding more efficient computation than single-precision designs through sharing of hardware resources.
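The trade-off the abstract describes can be illustrated with a toy first-order access model. The formulas below are simplified assumptions for illustration only, not the paper's actual analytical forms: they count off-chip word accesses for a stride-1, same-padded convolution under two extreme reuse schemes (output-stationary vs. input-stationary), and the layer shape is a VGG-16-like example chosen for the sketch.

```python
def conv_accesses(H, W, C, M, K, scheme):
    """Estimate off-chip word accesses for an H x W x C input,
    M output channels, K x K kernels, stride 1, 'same' padding.
    A hypothetical first-order model, not the paper's derivation."""
    inputs = H * W * C        # input activation words
    weights = M * K * K * C   # weight words
    outputs = H * W * M       # output activation words
    if scheme == "output_reuse":
        # Partial sums stay on chip: each output is written once,
        # but every input is re-fetched for each output channel.
        return inputs * M + weights + outputs
    elif scheme == "input_reuse":
        # Inputs are read once; partial sums spill off chip: each
        # output is read and written once per input channel.
        return inputs + weights + outputs * 2 * C
    raise ValueError(scheme)

# VGG-16-like 56x56x256 layer with 256 3x3 filters (assumed shape)
for s in ("output_reuse", "input_reuse"):
    print(s, conv_accesses(56, 56, 256, 256, 3, s))
```

Under this toy model neither extreme wins for every layer shape, which is the motivation for the mixed scheme: on-chip buffer capacity determines how much of each operand can be held stationary, and a mixed policy can tune that split per layer.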