Robust speech recognition combining cepstral and articulatory features

Zhuanling Zha, Jin Hu, Qingran Zhan, Yahui Shan, Xiang Xie, Jing Wang, Hao Bo Cheng

2017

In this paper, a nonlinear relationship between pronunciation and auditory perception is introduced into speech recognition, and the results show superior robustness. An Extreme Learning Machine (ELM) that maps this relationship was trained on the MOCHA-TIMIT database. Articulatory Features (AFs) obtained from the network were fused with MFCCs to train two acoustic models, DNN-HMM and GMM-HMM. The relative increase in WER is 117.0% with MFCCs-AFs-GMM-HMM, compared with 125.6% with MFCCs-GMM-HMM. The DNN-HMM model also outperforms the GMM-HMM model in both relative and absolute terms.
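The pipeline described above (an ELM that maps acoustic frames to articulatory features, followed by fusion with MFCCs) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ELMRegressor`, the helper `fuse_features`, the hidden-layer size, the ridge term, the tanh activation, the 14 articulatory channels, and the synthetic data are all assumptions made for illustration. Fusion is assumed here to be frame-wise concatenation, which the abstract does not specify.

```python
import numpy as np


class ELMRegressor:
    """Hypothetical single-hidden-layer ELM: hidden weights are random and
    fixed; only the output weights are solved, in closed form."""

    def __init__(self, n_hidden=64, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge  # assumed regularisation, not from the paper
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Nonlinear random projection of the input frames.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, Y):
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        # Ridge-regularised least squares for the output weights.
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ Y)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


def fuse_features(mfcc, af):
    """Assumed fusion: concatenate MFCCs and AFs frame by frame."""
    return np.concatenate([mfcc, af], axis=1)


# Synthetic stand-in data; real training would use parallel acoustic and
# articulatory recordings such as those in MOCHA-TIMIT.
rng = np.random.default_rng(1)
mfcc = rng.standard_normal((200, 13))                 # 200 frames, 13 MFCCs
ema = np.tanh(mfcc @ rng.standard_normal((13, 14)))   # 14 channels (arbitrary)

elm = ELMRegressor().fit(mfcc, ema)
af = elm.predict(mfcc)            # estimated articulatory features
fused = fuse_features(mfcc, af)   # input to a GMM-HMM or DNN-HMM trainer
print(fused.shape)                # (200, 27)
```

The fused matrix would then replace the plain MFCC matrix as the input to acoustic model training; the training itself is outside the scope of this sketch.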

Keywords:

Speech recognition
Perception
Robustness (computer science)
Extreme learning machine
Cepstrum
Feature extraction
Computer science
Pronunciation
Hidden Markov model
Auditory perception
Nonlinear system
