A Novel Convolutional Neural Network Accelerator That Enables Fully-Pipelined Execution of Layers

2019 
In this paper, we propose a novel CNN accelerator, called MIDAP, that aims to maximize the utilization of MAC units by enabling fully-pipelined execution of layers. To this end, MIDAP adopts two-level pipelining (macro pipelining and micro pipelining) together with large on-chip SRAMs. The macro pipeline consists of three modules, one each for the convolution, activation, and pooling layers, and each module accesses separate memory units, so no access conflicts arise. For micro pipelining inside the convolution module, the datapath is designed to be free from dynamic resource contention. In addition, inter-layer feature map reuse is maximized by compile-time analysis. In simulation at a 1 GHz clock frequency with 1024 MACs, MIDAP achieves high end-to-end performance on several well-known CNNs: for instance, 144 fps for the Inception V3 model and 892 fps for the MobileNet V1 model. A high-level synthesis result shows that, thanks to its simple datapath, the accelerator can achieve about 2.0 TOPS/W in an area of less than 2 mm^2 with 8nm CMOS technology.
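Since the stated goal is high MAC utilization, it can help to relate the reported frame rates to the peak throughput of 1024 MACs at 1 GHz. The following is a minimal sketch of that arithmetic, not a calculation taken from the paper. The per-frame MAC counts are assumptions based on commonly cited approximate figures for these models (about 5.7 billion for Inception V3 and about 569 million for MobileNet V1); the paper's own operation counts may differ, and the function and variable names are hypothetical.

```python
def peak_macs_per_second(num_macs: int, clock_hz: float) -> float:
    """Peak throughput if every MAC unit does one multiply-accumulate per cycle."""
    return num_macs * clock_hz


def mac_utilization(fps: float, macs_per_frame: float,
                    num_macs: int, clock_hz: float) -> float:
    """Fraction of peak MAC throughput implied by an end-to-end frame rate."""
    return fps * macs_per_frame / peak_macs_per_second(num_macs, clock_hz)


# Values stated in the abstract.
NUM_MACS = 1024
CLOCK_HZ = 1e9
REPORTED_FPS = {"Inception V3": 144, "MobileNet V1": 892}

# ASSUMPTION: approximate per-frame MAC counts, not taken from the paper.
ASSUMED_MACS_PER_FRAME = {"Inception V3": 5.7e9, "MobileNet V1": 0.569e9}

if __name__ == "__main__":
    for model, fps in REPORTED_FPS.items():
        util = mac_utilization(fps, ASSUMED_MACS_PER_FRAME[model],
                               NUM_MACS, CLOCK_HZ)
        print(f"{model}: implied MAC utilization ~ {util:.1%}")
```

Under these assumed operation counts, the implied utilization comes out at roughly 80% for Inception V3 and roughly 50% for MobileNet V1. These figures shift directly with the per-frame MAC count used, so they should be read as an order-of-magnitude check only.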