Recognition of Korean Monosyllabic Speech Using 3D Facial Motion Data

2014 
The purpose of this study was to extract accurate facial-movement feature parameters using a 3D motion capture system. Facial movement was measured for 50 Korean monosyllable vocalizations, and parameters affected by each subject's oral structure were presented. Fifteen subjects, all in their twenties, were asked to vocalize 50 Korean vowel monosyllables twice, with 18 reflective markers attached to their lower faces. The facial movement data were then reduced to eight parameters (lip width, lip height, lip depth, lip angle, lip protrusion, lip-nose length, cheek area, and the angle of the temporomandibular joints) and represented as a pattern for each monosyllable vocalization. Learning and recognition of each monosyllable were performed with a speech recognition routine based on a Hidden Markov Model and the Viterbi algorithm. The recognition accuracy over all 50 monosyllables was about 93.6%. These results suggest that speech recognition of the Korean language is possible through quantitative facial movements captured with a 3D motion capture system, and this work is expected to serve as basic research for developing speech recognition algorithms that can handle more sophisticated speech.
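The abstract describes decoding each monosyllable's parameter sequence with a Hidden Markov Model and the Viterbi algorithm. The sketch below shows a minimal Viterbi decoder for a discrete-observation HMM; the function name and the toy two-state model are illustrative assumptions, not the authors' actual model, which operates on the eight facial-movement parameters.

```python
import numpy as np

def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for an observation sequence.

    obs     : sequence of observation indices
    start_p : (n_states,) initial state probabilities
    trans_p : (n_states, n_states) transition probabilities
    emit_p  : (n_states, n_obs) emission probabilities
    """
    T = len(obs)
    n_states = len(start_p)
    # Work in log space to avoid underflow on long sequences.
    logv = np.log(start_p) + np.log(emit_p[:, obs[0]])
    back = np.zeros((T, n_states), dtype=int)
    for t in range(1, T):
        # scores[i, j] = best log-prob of being in state i at t-1 and moving to j.
        scores = logv[:, None] + np.log(trans_p)
        back[t] = scores.argmax(axis=0)
        logv = scores.max(axis=0) + np.log(emit_p[:, obs[t]])
    # Backtrack from the best final state.
    path = [int(logv.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(logv.max())

# Toy two-state model (hypothetical numbers for demonstration only).
start = np.array([0.6, 0.4])
trans = np.array([[0.7, 0.3],
                  [0.4, 0.6]])
emit = np.array([[0.5, 0.4, 0.1],
                 [0.1, 0.3, 0.6]])
path, logp = viterbi([0, 1, 2], start, trans, emit)
# path == [0, 0, 1]; exp(logp) == 0.01512
```

In a recognition setup like the one described, one HMM would typically be trained per monosyllable, and an utterance would be assigned to the model whose Viterbi path yields the highest likelihood.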