LCP: a layer clusters paralleling mapping method for accelerating inception and residual networks on FPGA

2018 
Deep convolutional neural networks (DCNNs) have been widely used in various AI applications. Inception and Residual are two promising structures adopted in many important modern DCNN models, including AlphaGo Zero's model. These structures allow considerably increasing the depth and width of the network to improve accuracy, without increasing the computational budget or the difficulty of convergence. Various accelerators for DCNNs have been proposed based on FPGA platform because it has advantages of high performance, good power efficiency, and fast development round, etc. However, previous FPGA mapping methods cannot fully adapt to the different data localities among layers and other characteristics of Inception and Residual, which leads to a under-utilization of FPGA resources. We propose LCP, a Layer Clusters Paralleling mapping method to classify the layers into clusters based on their differences of parameters and data localities, and then accelerate them in different partitions of FPGA. We evaluate our mapping method by implementing Inception/Residual modules from GoogLeNet [8] and ResNet-50 [4] on Xilinx VC709 (Virtex 690T) FPGA. The results show that the proposed method fully utilizes resources and achieves up to 4.03 × performance than the baseline and 2.00 × performance than the state-of-the-art methods.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    10
    References
    13
    Citations
    NaN
    KQI
    []