Time-domain neural network: A 48.5 TSOp/s/W neuromorphic chip optimized for deep learning and CMOS technology

2016 
Demand is increasing for highly energy-efficient hardware for deep neural network inference. Ultimately, a fully spatially unrolled architecture, in which each distributed weight memory has a processing element (PE) for its exclusive use, is the most energy-efficient solution because i) it completely eliminates the energy-hungry data movement required for weight fetching, and ii) the PEs can consist solely of combinational logic, which generally consumes less power than flip-flops. However, this strategy has not been applied because it requires a prohibitively large amount of both area and hardware resources. We propose TDNN, which enables the fully spatially unrolled architecture by using 3D stacked ReRAM and time-domain analog-digital mixed-signal processing that uses delay time as the signal. In TDNN, a PE that performs a synaptic operation is composed of only 12 logic transistors, which are equivalent to 3 gates. The proof-of-concept chip, built with SRAM instead of ReRAM, shows an unprecedentedly high energy efficiency of 48.2 TSOp/s/W.
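
As a rough illustration of what the efficiency figure means, the sketch below converts an efficiency in TSOp/s/W into energy per synaptic operation. It relies only on the unit identity (operations per second) per watt = operations per joule; the function name `energy_per_op_femtojoules` is a hypothetical helper introduced here, not something from the paper. The abstract quotes 48.2 TSOp/s/W, while the title quotes 48.5, so both are evaluated.

```python
def energy_per_op_femtojoules(tsop_per_s_per_w: float) -> float:
    """Convert an efficiency in TSOp/s/W to energy per synaptic operation in fJ.

    Hypothetical helper for illustration only; it is not part of the paper.
    """
    ops_per_joule = tsop_per_s_per_w * 1e12  # (ops/s)/W is the same as ops/J
    joules_per_op = 1.0 / ops_per_joule
    return joules_per_op * 1e15  # joules -> femtojoules


abstract_fj = energy_per_op_femtojoules(48.2)  # figure quoted in the abstract
title_fj = energy_per_op_femtojoules(48.5)     # figure quoted in the title

print(f"abstract figure: {abstract_fj:.2f} fJ per synaptic operation")
print(f"title figure:    {title_fj:.2f} fJ per synaptic operation")
```

Under this conversion, either figure corresponds to roughly 21 fJ per synaptic operation.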