Neural Network Acceleration and Voice Recognition with a Flash-based In-Memory Computing SoC
2021
AI inference based on novel compute-in-memory devices has shown clear advantages in power, speed, and storage density, making it a promising candidate for IoT and edge computing applications. In this work, we demonstrate a fully integrated system-on-chip (SoC) design with embedded Flash memories as the neural network accelerator. A series of techniques from the device, design, and system perspectives are combined to enable efficient AI inference for resource-constrained voice recognition. The 7-bit/cell storage capability and self-adaptive write of novel Flash memories are leveraged to achieve state-of-the-art overall performance. In addition, model deployment techniques based on transfer learning are explored to significantly reduce the accuracy loss incurred during weight data deployment. Integrated in a compact form factor, the whole voice recognition system achieves >10 TOPS/W energy efficiency and ∼95% accuracy for real-time keyword spotting applications.
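The 7-bit/cell storage described above means each Flash cell holds one of 128 discrete conductance levels, so floating-point weights must be quantized before deployment. A minimal sketch of that step, assuming a simple linear mapping between weight range and cell levels (the function name and mapping are illustrative, not the paper's actual scheme):

```python
import numpy as np

def quantize_to_cells(weights, bits=7):
    """Snap float weights onto the 2**bits discrete conductance levels
    of a multi-level Flash cell (hypothetical linear mapping)."""
    levels = 2 ** bits  # 128 levels for 7-bit/cell storage
    w_min, w_max = float(weights.min()), float(weights.max())
    step = (w_max - w_min) / (levels - 1)
    codes = np.round((weights - w_min) / step)  # integer level index per weight
    return codes * step + w_min  # reconstructed (deployed) weight values

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(64, 64)).astype(np.float32)
wq = quantize_to_cells(w)
max_err = np.abs(w - wq).max()  # bounded by half a quantization step
```

The residual quantization error is what motivates the transfer-learning-based deployment mentioned in the abstract: the model can be briefly fine-tuned against the quantized weights to recover accuracy.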