A High Accuracy Multiple-Command Speech Recognition ASIC Based on Configurable One-Dimension Convolutional Neural Network

Lindong Wu, Zongwei Wang, Ming Zhao, Wei Hu, Yimao Cai, Ru Huang

2021

Speech command interaction has drawn much attention in the smart application market. Many previous chips achieve ultra-low power consumption at the cost of some accuracy loss, and they are designed only for fixed speech command recognition tasks, which is inflexible and restrains further development. Here, we demonstrate a configurable speech command recognition ASIC with ultra-high accuracy, fabricated in TSMC commercial 180-nm CMOS technology. In this chip, Mel-Frequency Cepstral Coefficients (MFCCs) are used as speech features, and a One-Dimension Convolutional Neural Network (1-D CNN) is adopted for speech feature recognition, which simplifies both the network design and the memory storage scheme. Moreover, the configurable 1-D CNN layer of the network ensures the diversity and flexibility of the commands. Measurement results indicate that the chip achieves 95.6% accuracy on the Google Speech Command Database (GSCD) when working at 16 MHz, while keeping a reasonable power consumption of 26.4 mW. In addition, the chip supports up to 30 speech commands at a time, which exceeds state-of-the-art chips.
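
As an illustration only, the MFCC-plus-1-D-CNN pipeline described above can be sketched in software as follows. This is a minimal sketch, not the chip's implementation: the frame count, the number of MFCC coefficients, the layer sizes, the global average pooling, and the random weights are all assumptions made for the example. Only the limit of 30 commands comes from the abstract.

```python
import numpy as np

def conv1d(x, w, b, stride=1):
    """Valid 1-D convolution along the time axis.

    x: (T, C_in) sequence of feature frames
    w: (K, C_in, C_out) kernel, b: (C_out,) bias
    """
    k = w.shape[0]
    t_out = (x.shape[0] - k) // stride + 1
    out = np.empty((t_out, w.shape[2]))
    for t in range(t_out):
        window = x[t * stride : t * stride + k]  # (K, C_in)
        # Contract over kernel position and input channel.
        out[t] = np.tensordot(window, w, axes=([0, 1], [0, 1])) + b
    return out

def classify(mfcc, layers, w_fc, b_fc):
    """Run a stack of 1-D conv layers with ReLU, then a linear classifier."""
    h = mfcc
    for w, b, stride in layers:
        h = np.maximum(conv1d(h, w, b, stride), 0.0)  # ReLU
    pooled = h.mean(axis=0)  # global average pooling over time (assumed)
    logits = pooled @ w_fc + b_fc
    return int(np.argmax(logits)), logits

rng = np.random.default_rng(0)
mfcc = rng.standard_normal((49, 13))  # hypothetical: 49 frames x 13 MFCCs

# The layer configuration is plain data, so kernel sizes and channel
# counts can be changed without touching the code (all values assumed).
layers = [
    (rng.standard_normal((3, 13, 16)) * 0.1, np.zeros(16), 1),
    (rng.standard_normal((3, 16, 32)) * 0.1, np.zeros(32), 2),
]
num_commands = 30  # maximum command count stated in the abstract
w_fc = rng.standard_normal((32, num_commands)) * 0.1
b_fc = np.zeros(num_commands)

label, logits = classify(mfcc, layers, w_fc, b_fc)
```

Because each MFCC frame is treated as a vector of input channels, the convolution slides along time only, so every layer needs just a small window of consecutive frames in memory at once.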
