FPGA-based CNN Processor with Filter-Wise-Optimized Bit Precision

2018 
Many efforts have been made to improve the efficiency of deep convolutional neural network inference. To achieve further efficiency gains without sacrificing accuracy, we propose filter-wise optimized quantization with variable precision and a hardware architecture that fully supports it. Because the bit precision of operations is reduced by optimizing the weight bit precision at the granularity of individual filters, the execution time is reduced proportionally to the total number of computations multiplied by the number of weight bits. We implement the proposed architecture on an FPGA and demonstrate that ResNet-50 runs with 5.3× fewer execution cycles without any loss of accuracy.
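
As a rough illustration of the idea described in the abstract (not the paper's actual algorithm or hardware), the Python sketch below assigns each convolution filter the smallest bit width whose quantization error stays under a tolerance, then estimates relative execution cycles under the abstract's cost model (cycles proportional to computations × weight bits). The function names, the tolerance, and the layer shape are illustrative assumptions.

    import numpy as np

    def quantize(weights, bits):
        """Uniform symmetric quantization of a filter's weights to `bits` bits (illustrative)."""
        levels = 2 ** (bits - 1) - 1                 # signed range, e.g. 3 bits -> [-3, 3]
        scale = np.max(np.abs(weights)) / levels
        return np.round(weights / scale) * scale     # de-quantized values for error measurement

    def choose_bitwidth(filt, tol=1e-2, max_bits=8):
        """Pick the smallest bit width keeping relative mean-squared error below `tol` (assumed criterion)."""
        for bits in range(2, max_bits + 1):
            if np.mean((filt - quantize(filt, bits)) ** 2) <= tol * np.var(filt):
                return bits
        return max_bits

    # Example conv layer with shape (out_channels, in_channels, kH, kW);
    # bit precision is optimized filter-by-filter, i.e. per output channel.
    layer = np.random.randn(64, 64, 3, 3).astype(np.float32)
    bitwidths = [choose_bitwidth(layer[c]) for c in range(layer.shape[0])]

    # Per the abstract's cost model, cycles scale with (computations x weight bits),
    # so reducing bits per filter translates directly into fewer cycles.
    weights_per_filter = np.prod(layer.shape[1:])    # proportional to MACs per output position
    total_cost = sum(weights_per_filter * b for b in bitwidths)
    baseline_cost = weights_per_filter * 8 * layer.shape[0]
    print(f"per-filter bits: min={min(bitwidths)}, max={max(bitwidths)}, "
          f"relative cycles vs. uniform 8-bit: {total_cost / baseline_cost:.2f}")

In this sketch the bit-width search is a simple error-threshold scan; the paper's accuracy-preserving selection and the FPGA datapath that exploits variable precision are only summarized in the abstract above.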