Development of a Vietnamese Large Vocabulary Continuous Speech Recognition System under Noisy Conditions

Quoc Bao Nguyen, Van Tuan Mai, Quang Trung Le, Ba Quyen Dam, Van Hai Do

2018

In this paper, we first present our effort to collect a 500-hour corpus of Vietnamese read speech. We then apply several techniques to build the speech recognition system: data augmentation, recurrent neural network language model rescoring, language model adaptation, bottleneck features, and system combination. Our final system achieves a word error rate of 6.9% on the noisy test set.
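
For readers unfamiliar with the metric, the word error rate reported above is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the reference transcript into the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring tool used by the authors (which the abstract does not name). The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Levenshtein-based WER over whitespace-separated tokens (illustrative)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substitution and one deletion over 4 words.
print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a word error rate of 6.9% corresponds to roughly 7 word-level errors per 100 reference words.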

Keywords:

Recurrent neural network
Speech recognition
Vietnamese
Language model
Word error rate
Speech corpus
Test set
Bottleneck
Computer science
Vocabulary
