FPGA-based CNN Processor with Filter-Wise-Optimized Bit Precision

Asuka Maki, Daisuke Miyashita, Kengo Nakata, Fumihiko Tachibana, Tomoya Suzuki, Jun Deguchi

2018

Many efforts have been made to improve the efficiency of inference in deep convolutional neural networks. To improve efficiency further without any accuracy penalty, we propose filter-wise optimized quantization with variable precision and a hardware architecture that fully supports it. Because the weight bit precision is optimized at fine granularity, filter by filter, the bit precision of the operations is reduced, and the execution time falls in proportion to the total number of computations multiplied by the number of weight bits. We implement the proposed architecture on an FPGA and demonstrate that ResNet-50 runs with 5.3× fewer execution cycles without any accuracy penalty.
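
The proportionality stated in the abstract can be illustrated with a small cost model. The following is a minimal sketch, not the authors' implementation: it assumes that the cycle count of a layer is proportional to the sum, over filters, of the number of multiply-accumulate operations times that filter's weight bit width. The function names, the 16-bit baseline, and the example bit widths are all hypothetical and chosen only for illustration.

```python
def relative_cycles(macs_per_filter, bits_per_filter):
    """Cost proxy: sum over filters of (MAC count x weight bit width).

    Assumption: execution time scales linearly with this product,
    as the abstract states; constants of proportionality are ignored.
    """
    if len(macs_per_filter) != len(bits_per_filter):
        raise ValueError("one bit width is required per filter")
    return sum(m * b for m, b in zip(macs_per_filter, bits_per_filter))


def speedup(macs_per_filter, bits_per_filter, baseline_bits):
    """Ratio of the uniform-precision cost to the filter-wise cost."""
    uniform = [baseline_bits] * len(macs_per_filter)
    baseline = relative_cycles(macs_per_filter, uniform)
    return baseline / relative_cycles(macs_per_filter, bits_per_filter)


# Hypothetical layer: 4 filters with 1000 MACs each and made-up bit widths.
macs = [1000, 1000, 1000, 1000]
bits = [2, 3, 4, 7]
print(speedup(macs, bits, baseline_bits=16))  # 64000 / 16000 = 4.0
```

Under this model, filters that tolerate fewer weight bits contribute proportionally fewer cycles, so the overall gain depends on the distribution of bit widths across filters rather than on a single network-wide precision.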

Keywords:

Real-time computing
Hardware architecture
Quantization (signal processing)
Computation
Field-programmable gate array
Computer science
Convolutional neural network
Convolution
Architecture
Inference
Execution time
Computer hardware

