Improved Mandarin speech recognition by lattice rescoring with enhanced tone models

Huanliang Wang, Yao Qian, Frank K. Soong, Jian-Lai Zhou, Jiqing Han

2006

Tone plays an important lexical role in spoken tonal languages such as Mandarin Chinese. In this paper, we propose a two-pass search strategy for improving tonal syllable recognition. In the first pass, instantaneous F0 information is used along with the corresponding cepstral information in two-stream HMM-based decoding. The F0 stream, which incorporates both discrete voiced/unvoiced information and the continuous F0 contour, is modeled with a multi-space distribution. With first-pass decoding alone, we recently reported a 24% relative reduction in tonal syllable recognition errors on a Mandarin Chinese database [5]. In the second pass, F0 information over a horizontal, longer time span is used to build explicit tone models for rescoring the lattice generated in the first pass. Experimental results on the same Mandarin database show that the second-pass search, lattice rescoring with enhanced tone models, yields an additional 8% relative reduction in tonal syllable recognition errors.
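
The second-pass idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how first-pass and tone-model scores are combined, so the log-linear combination below, the `Arc` class, the `rescore_best_path` function, the `tone_weight` parameter and the toy tone scores are all hypothetical. The sketch only shows the general shape of lattice rescoring: each arc of a first-pass lattice receives an extra tone-model log score, and the best path is then recomputed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Arc:
    """One lattice arc (hypothetical structure, for illustration only)."""
    start: int         # source node id; nodes are assumed topologically numbered
    end: int           # destination node id, assumed to be greater than start
    syllable: str      # tonal syllable hypothesis, e.g. "ma3"
    first_pass: float  # log score assigned by the first-pass decoder


def rescore_best_path(
    arcs: List[Arc],
    tone_logprob: Callable[[Arc], float],
    tone_weight: float,
    final_node: int,
) -> Tuple[float, List[str]]:
    """Return the best (score, syllable path) from node 0 to final_node.

    Assumption: the rescored arc score is
    first_pass + tone_weight * tone_logprob(arc).
    """
    # best[node] holds the best score and path found so far for that node.
    best: Dict[int, Tuple[float, List[str]]] = {0: (0.0, [])}
    # Because end > start on every arc, visiting arcs by increasing start
    # node guarantees best[arc.start] is final when the arc is expanded.
    for arc in sorted(arcs, key=lambda a: a.start):
        if arc.start not in best:
            continue  # node not reachable from the initial node
        prev_score, prev_path = best[arc.start]
        score = prev_score + arc.first_pass + tone_weight * tone_logprob(arc)
        if arc.end not in best or score > best[arc.end][0]:
            best[arc.end] = (score, prev_path + [arc.syllable])
    if final_node not in best:
        raise ValueError("final node is not reachable from node 0")
    return best[final_node]


# Toy lattice: two competing tone hypotheses for the first syllable.
toy_arcs = [
    Arc(0, 1, "ma1", -1.0),
    Arc(0, 1, "ma3", -1.2),
    Arc(1, 2, "hao3", -0.5),
]
# Made-up tone-model log scores; a real system would compute them from
# the F0 contour over each arc's time span.
toy_tone_scores = {"ma1": -2.0, "ma3": -0.5, "hao3": -0.1}

first_pass_only = rescore_best_path(toy_arcs, lambda a: 0.0, 0.0, 2)
with_tone_model = rescore_best_path(
    toy_arcs, lambda a: toy_tone_scores[a.syllable], 1.0, 2
)
```

In this toy lattice the first-pass scores alone favour "ma1", while adding the made-up tone scores switches the best path to "ma3". That is the kind of correction a second pass with explicit tone models is meant to make.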

