Neural Network Acceleration and Voice Recognition with a Flash-based In-Memory Computing SoC

2021 
AI inference based on novel compute-in-memory devices has shown clear advantages in power, speed and storage density, making it a promising candidate for IoT and edge computing applications. In this work, we demonstrate a fully integrated system-on-chip (SoC) design with embedded Flash memories as the neural network accelerator. A series of techniques from device, design and system perspectives are combined to enable efficient AI inference for resource-constrained voice recognition. The 7-bit/cell storage capability and self-adaptive write scheme of the novel Flash memories are leveraged to achieve state-of-the-art overall performance. In addition, model deployment techniques based on transfer learning are explored to significantly reduce the accuracy loss incurred during weight data deployment. Integrated in a compact form factor, the whole voice recognition system achieves >10 TOPS/W energy efficiency and ∼95% accuracy for real-time keyword spotting applications.
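The abstract does not give implementation details, but as a rough illustration of what 7-bit/cell weight deployment implies, the following NumPy sketch maps floating-point weights onto 2^7 = 128 discrete conductance levels and reports the resulting deployment error. The function name, the uniform level spacing, and the Gaussian write-noise term are all illustrative assumptions, not the authors' device model; reducing the error that remains after this step is what the transfer-learning-based deployment techniques target.

```python
import numpy as np

def quantize_to_flash_levels(weights, bits=7, write_noise_std=0.0):
    """Map floating-point weights onto discrete Flash conductance levels.

    bits=7 gives 2**7 = 128 levels per cell, matching the 7-bit/cell
    capability reported in the paper. write_noise_std is a placeholder
    standing in for residual programming error after the self-adaptive
    write; its value and Gaussian form are assumptions, not measured data.
    """
    levels = 2 ** bits
    w_min, w_max = weights.min(), weights.max()
    step = (w_max - w_min) / (levels - 1)
    # Snap each weight to the nearest programmable conductance level.
    q = np.round((weights - w_min) / step) * step + w_min
    # Optional perturbation modeling cell-to-cell write variation.
    if write_noise_std > 0:
        q += np.random.normal(0.0, write_noise_std * step, size=q.shape)
    return q

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(64, 64)).astype(np.float32)
w_q = quantize_to_flash_levels(w, bits=7, write_noise_std=0.5)
print("mean absolute deployment error:", np.mean(np.abs(w - w_q)))
```

In a deployment flow of the kind the abstract describes, a pretrained model would be fine-tuned with this quantization in the loop (a form of transfer learning) so the network adapts to the discrete, noisy weight values before being programmed into the Flash array.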