Phone recognition for Lhasa-Tibetan based on articulatory features augmentation learning
2016
In a series of studies, articulatory features used as speech attributes in automatic speech recognition systems have been shown to improve performance. Existing articulatory features are defined by phoneticians as a set of articulatory descriptions of phones; they carry semantic information explaining how humans produce speech sounds through the interaction of different physiological structures. However, these manually specified attributes capture the articulation information of a language incompletely and are not distinctive enough for accurate phoneme recognition. In this paper, we address the problem of obtaining a more complete articulatory feature representation by using sparse coding methods. Taking Lhasa-Tibetan as an example, we learn latent attributes that sparsely represent additional speech articulation information in the Tibetan language. In our Tibetan phone recognition experiments, models based on the concatenated semantic and latent speech attributes achieved better accuracy than existing methods based on semantic speech attributes fused with cepstral features.