Multiscale Co-Design Analysis of Energy, Latency, Area, and Accuracy of a ReRAM Analog Neural Training Accelerator

2018 
Neural networks are an increasingly attractive algorithm for natural language processing and pattern recognition. Deep networks with >50 M parameters are made possible by modern graphics processing unit clusters. The analog ReRAM crossbar accelerator block analyzed here achieves a $270\times$ energy and $540\times$ latency advantage over a similar block utilizing only digital ReRAM, at only 11 fJ per multiply and accumulate. Compared with an SRAM-based accelerator, the energy is $430\times$ better and the latency is $34\times$ better. Although training accuracy is degraded in the analog accelerator, several options to improve it are presented. The possible gains over a similar digital-only version of this accelerator block suggest that continued optimization of analog resistive memories is valuable. This detailed circuit- and device-level analysis of a training accelerator may serve as a foundation for further architecture-level studies.
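To put the quoted figures in perspective, the following minimal sketch scales the abstract's 11 fJ per multiply-and-accumulate and the 270x (digital ReRAM) and 430x (SRAM) energy factors to a hypothetical 1024x1024 fully connected layer. The layer dimensions and the assumption that the factors apply uniformly per MAC are illustrative choices, not figures from the paper.

```python
# Back-of-envelope sketch (illustrative, not from the paper): scale the abstract's
# per-MAC energy and relative factors to a hypothetical dense layer.
ANALOG_RERAM_J_PER_MAC = 11e-15   # 11 fJ per multiply-and-accumulate (from abstract)
DIGITAL_RERAM_FACTOR = 270        # analog block's energy advantage over digital ReRAM
SRAM_FACTOR = 430                 # analog block's energy advantage over SRAM accelerator

def layer_energy_joules(inputs: int, outputs: int, j_per_mac: float) -> float:
    """Energy of one dense-layer forward pass: one MAC per weight."""
    return inputs * outputs * j_per_mac

if __name__ == "__main__":
    n_in, n_out = 1024, 1024      # hypothetical layer dimensions (assumption)
    analog = layer_energy_joules(n_in, n_out, ANALOG_RERAM_J_PER_MAC)
    print(f"Analog ReRAM crossbar: {analog * 1e9:.1f} nJ")   # ~11.5 nJ
    print(f"Digital ReRAM (270x):  {analog * DIGITAL_RERAM_FACTOR * 1e6:.1f} uJ")
    print(f"SRAM-based (430x):     {analog * SRAM_FACTOR * 1e6:.1f} uJ")
```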