Benchmark of the Compute-in-Memory-Based DNN Accelerator With Area Constraint

2020 
Compute-in-memory (CIM) is a promising computing paradigm to accelerate the inference of deep neural network (DNN) algorithms due to its high processing parallelism and energy efficiency. Prior CIM-based DNN accelerators mostly consider full custom designs, which assume that all the weights are stored on-chip. For lightweight smart edge devices, this assumption may not hold. In this article, CIM-based DNN accelerators are designed and benchmarked under different chip area constraints. First, a scheduling strategy and dataflow for DNN inference are investigated for the case where only part of the weights can be stored on-chip. Two weight reload schemes are evaluated: 1) reload partial weights and reuse the input/output feature maps and 2) load a batch of inputs and reuse the partial weights on-chip across the batch. Then, a system-level performance benchmark is performed for the inference of ResNet-18 on the ImageNet data set. The design tradeoffs with different area constraints, dataflows, and device technologies [static random access memory (SRAM) versus ferroelectric field-effect transistor (FeFET)] are discussed.
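A minimal sketch of the two weight reload schemes, assuming the on-chip CIM capacity holds only part of a layer's weights. All names and tile/batch sizes below (NUM_WEIGHT_TILES, BATCH_SIZE, load_weight_tile, compute_partial) are illustrative placeholders, not from the paper; the point is only the contrast in loop ordering and the resulting number of off-chip weight transfers.

```python
# Hypothetical sketch (not the paper's code): loop orderings of the two
# weight-reload schemes when weights do not all fit on-chip.

NUM_WEIGHT_TILES = 4   # reloads needed to cover one layer's weights (assumed)
BATCH_SIZE = 8         # inputs processed per resident weight tile in scheme 2

def load_weight_tile(tile_idx):
    """Placeholder for a DRAM -> CIM-array weight transfer (the costly step)."""
    return f"W[{tile_idx}]"

def compute_partial(weight_tile, feature_map):
    """Placeholder for the CIM matrix-vector computation on one weight tile."""
    return f"({feature_map}*{weight_tile})"

def scheme1_reload_weights(inputs):
    """Scheme 1: keep the input/output feature maps on-chip and stream
    weight tiles past them; weights are reloaded once per input."""
    weight_loads = 0
    for fmap in inputs:
        for t in range(NUM_WEIGHT_TILES):
            w = load_weight_tile(t)
            weight_loads += 1
            fmap = compute_partial(w, fmap)
    return weight_loads          # = len(inputs) * NUM_WEIGHT_TILES

def scheme2_batch_inputs(inputs):
    """Scheme 2: keep a weight tile resident and push a whole batch of inputs
    through it, so each weight reload is amortized over BATCH_SIZE inputs."""
    weight_loads = 0
    for start in range(0, len(inputs), BATCH_SIZE):
        batch = inputs[start:start + BATCH_SIZE]
        for t in range(NUM_WEIGHT_TILES):
            w = load_weight_tile(t)
            weight_loads += 1
            batch = [compute_partial(w, f) for f in batch]
    return weight_loads          # = ceil(len(inputs)/BATCH_SIZE) * NUM_WEIGHT_TILES

if __name__ == "__main__":
    imgs = [f"img{i}" for i in range(16)]
    print("scheme 1 weight reloads:", scheme1_reload_weights(imgs))  # 64
    print("scheme 2 weight reloads:", scheme2_batch_inputs(imgs))    # 8
```

Under these assumed sizes, scheme 2 cuts off-chip weight traffic by the batch size, at the cost of buffering a batch of feature maps on-chip; which scheme wins depends on the area constraint and the relative cost of moving weights versus activations, which is what the benchmark in the paper examines.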