9.2 A 28nm 12.1TOPS/W Dual-Mode CNN Processor Using Effective-Weight-Based Convolution and Error-Compensation-Based Prediction

2021 
To deploy convolutional neural networks (CNNs) efficiently on edge devices, most existing CNN processors are built on quantized CNNs to optimize inference. However, three issues (Fig. 9.2.1) have not been well addressed: 1) duplicate weights in each kernel after quantization, which lead to repetitive multiplications; 2) a large number of unnecessary MACs caused by ReLU activation functions; and 3) frequent off-chip memory access in residual blocks.
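
To make the first issue concrete, the following is a minimal sketch of why duplicate quantized weights imply repeated multiplications. It is not the processor's effective-weight-based convolution, whose details are not given here. It only shows the generic idea of summing the inputs that share a weight value and multiplying once per distinct value. The kernel values, the input patch, and the function names `dot_naive` and `dot_grouped` are all hypothetical.

```python
from collections import defaultdict


def dot_naive(weights, inputs):
    """Standard dot product: one multiplication per weight."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return total, len(weights)


def dot_grouped(weights, inputs):
    """Accumulate inputs that share a weight value, then multiply once
    per distinct nonzero value. Returns (result, multiplication count)."""
    sums = defaultdict(int)
    for w, x in zip(weights, inputs):
        if w != 0:  # zero weights contribute nothing
            sums[w] += x
    total = sum(w * s for w, s in sums.items())
    return total, len(sums)


# Hypothetical 3x3 kernel after low-bit quantization (flattened), with
# only three distinct nonzero values, and a hypothetical input patch.
kernel = [1, -2, 1, 0, 3, -2, 1, 3, 0]
patch = [5, 7, 2, 9, 4, 1, 6, 8, 3]

print(dot_naive(kernel, patch))    # (33, 9)
print(dot_grouped(kernel, patch))  # (33, 3)
```

In this toy case both functions return the same result, but the grouped form needs 3 multiplications instead of 9. The saving grows as quantization leaves fewer distinct weight values per kernel.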