In Search of the Performance- and Energy-Efficient CNN Accelerators

2021 
In this paper, starting from the algorithm, a performance- and energy-efficient 3D structure, or shape, of the Tensor Processing Engine (TPE) for CNN acceleration is systematically searched for and evaluated. An optimal accelerator shape maximizes the number of concurrent MAC operations per clock cycle while minimizing the number of redundant operations. The proposed 3D vector-parallel TPE architecture with an optimal shape can accelerate CNNs efficiently. Because image data in different blocks are independent, multiple such TPEs can be used in parallel for additional CNN acceleration. Moreover, the proposed TPE is shown to apply uniformly to the acceleration of different CNN models such as VGG, ResNet, YOLO and SSD.
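The trade-off between concurrent MAC operations and redundant ones can be illustrated with a small sketch. It is not the paper's search procedure: it assumes a simple tiling model in which a TPE of shape (a, b, c) covers a 3D block of a layer's work each cycle, and any MAC slot that falls outside the layer's dimensions counts as redundant. The function names, the fixed MAC budget, and the layer dimensions are all hypothetical.

```python
from math import ceil

def shapes(budget):
    """Yield every 3D shape (a, b, c) with a * b * c == budget MAC units."""
    for a in range(1, budget + 1):
        if budget % a:
            continue
        rest = budget // a
        for b in range(1, rest + 1):
            if rest % b == 0:
                yield a, b, rest // b

def cycles(shape, dims):
    """Cycles needed to tile a layer of size dims with blocks of size shape."""
    total = 1
    for s, d in zip(shape, dims):
        total *= ceil(d / s)
    return total

def utilization(shape, dims):
    """Fraction of MAC slots doing useful (non-redundant) work."""
    useful = dims[0] * dims[1] * dims[2]
    slots = cycles(shape, dims) * shape[0] * shape[1] * shape[2]
    return useful / slots

def best_shape(budget, layers):
    """Shape with the fewest total cycles over all layers (assumed model).

    With a fixed budget, fewer cycles means fewer wasted MAC slots.
    """
    return min(shapes(budget), key=lambda s: sum(cycles(s, d) for d in layers))

if __name__ == "__main__":
    # Hypothetical layer: 7 x 7 output positions, 64 channels.
    layer = (7, 7, 64)
    shape = best_shape(64, [layer])
    print(shape, cycles(shape, layer), utilization(shape, layer))
```

Under this toy model a cubic 4 x 4 x 4 engine wastes slots on a 7 x 7 x 64 layer because 7 is not a multiple of 4, while a shape matched to the layer's dimensions keeps every MAC unit busy. A real search would also weigh data movement and energy, which this sketch ignores.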