Development of an approach to automatic language identification based on phone recognition

Yonghong Yan, Etienne Barnard, Ronald A. Cole

1996

Abstract: An automatic language identification (LID) approach is presented. The baseline LID system consists of three parts: (1) hidden Markov model (HMM) based context-independent phone recognizers, (2) language identification score generators and (3) a linear language classifier. The system exploits language-dependent phonotactic constraints and prosodic information. Four methods are proposed to improve system performance. First, two bigram-based interpolated N-gram language models (forward and backward) are used to model the phone sequence constraints of different spoken languages. Second, a context-dependent duration model, interpolated with a context-independent duration model, is used to capture duration information. Third, the linear classifier is compared experimentally with neural network-based final classifiers. Finally, optimization of the language model based on back propagation is proposed. The improved system was evaluated on an 11-language task and reached error rates of 13.3% and 26.2% for utterances averaging 45 s and 10 s in duration, respectively. Comparison with the baseline system shows the importance of the issues addressed in this paper for language identification.
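
The phonotactic scoring idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the interpolation weight `lam`, the probability `floor`, the function names, and the choice to simply add the forward and backward log scores (instead of feeding them to a trained classifier) are all assumptions made for illustration. Each language gets a forward bigram model trained on phone sequences and a backward model trained on the reversed sequences; both are interpolated with a unigram model.

```python
import math
from collections import Counter

def train_bigram(sequences):
    """Collect unigram, bigram, and bigram-context counts from phone sequences."""
    unigrams, bigrams, contexts = Counter(), Counter(), Counter()
    for seq in sequences:
        unigrams.update(seq)
        bigrams.update(zip(seq, seq[1:]))
        contexts.update(seq[:-1])  # phones that have a successor
    return unigrams, bigrams, contexts

def log_score(seq, model, lam=0.7, floor=1e-6):
    """Log-likelihood of seq under a bigram model interpolated with a unigram model.

    lam and floor are illustrative values, not taken from the paper.
    """
    unigrams, bigrams, contexts = model
    total = sum(unigrams.values())
    score = 0.0
    for prev, cur in zip(seq, seq[1:]):
        p_uni = unigrams[cur] / total
        p_bi = bigrams[(prev, cur)] / contexts[prev] if contexts[prev] else 0.0
        score += math.log(lam * p_bi + (1.0 - lam) * p_uni + floor)
    return score

def train_language(sequences):
    """Forward model plus a backward model trained on reversed sequences."""
    forward = train_bigram(sequences)
    backward = train_bigram([seq[::-1] for seq in sequences])
    return forward, backward

def identify(seq, models):
    """Pick the language whose summed forward and backward scores are highest.

    Summing the two scores is an assumption; the paper uses a trained classifier.
    """
    def combined(lang):
        forward, backward = models[lang]
        return log_score(seq, forward) + log_score(seq[::-1], backward)
    return max(models, key=combined)

# Toy phone sequences for two hypothetical languages "A" and "B".
models = {
    "A": train_language([["a", "b", "a", "b", "a", "b"]]),
    "B": train_language([["b", "b", "a", "a", "b", "b", "a", "a"]]),
}
guess = identify(["a", "b", "a", "b"], models)
```

In the system described above, the per-language scores would come from phone recognizer output and would be passed, together with duration scores, to the final linear or neural network classifier instead of being compared directly.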
