An FPGA-based MobileNet Accelerator Considering Network Structure Characteristics

2021 
Convolutional neural networks (CNNs) have been widely deployed in computer vision tasks. However, their computation- and resource-intensive nature hinders their application on embedded systems. MobileNet, a representative compact model, reduces both the number of parameters and the computational cost. This paper proposes a high-performance inference accelerator for MobileNet on FPGA. For the three types of convolution operations, multiple parallel strategies are exploited and the corresponding hardware structures, such as the input buffer and a configurable adder tree, are designed. For the bottleneck block, a dedicated architecture is proposed to reduce data transmission time. In addition, a hardware padding scheme is proposed to improve the efficiency of padding. The accelerator, implemented on a Virtex-7 FPGA, reaches 70.8% Top-1 accuracy under 8-bit quantization. It achieves 302.3 FPS and 181.8 GOPS, a 22.7x, 3.9x, and 1.4x speedup over implementations on a Snapdragon 821 CPU, an i7-6700HQ CPU, and a GTX 960M GPU, respectively.
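MobileNet's savings come from replacing standard convolutions with depthwise separable ones. As an illustrative back-of-the-envelope calculation (not taken from the paper itself, and using hypothetical layer sizes), the parameter counts compare as follows:

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Parameters of a standard k x k convolution: k*k*c_in filters per output channel."""
    return k * k * c_in * c_out

def dw_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Depthwise separable convolution: a k x k depthwise conv (one filter
    per input channel) followed by a 1x1 pointwise conv."""
    return k * k * c_in + c_in * c_out

# Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
standard = conv_params(3, 64, 128)        # 73728 parameters
separable = dw_separable_params(3, 64, 128)  # 8768 parameters
print(f"reduction: {standard / separable:.1f}x")
```

For this example layer the separable form uses roughly 8.4x fewer parameters, which is the structural property the accelerator's three distinct convolution datapaths are built to exploit.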