Computer graphics facial animation of Japanese speech using three‐dimensional dynamic visemes

2006 
Speech production generates not only acoustic signals but also visual images of the face, especially of the jaw, lips, teeth, and tongue. To reproduce realistic facial images during Japanese speech, a three‐dimensional computer graphics model of Japanese visemes was made. These visemes were extracted from a video database of images around the lips during speech, captured by two high‐speed (up to 300 frames per second) cameras [M. J. Hirayama, Proc. ICPhS 2003 (2003), pp. 3157–3161]. Most of the visemes are created as static shapes, covering the five vowels, semi‐vowels, and some consonants. For explosives articulated by the lips (/p/ /b/ /m/) or the tongue (/t/ /d/ /n/), dynamic information, that is, multiple shapes and timing information, was assigned to each viseme. The visemes were placed on a time axis at the timing of the phonemes of a sentence, the shapes in between were interpolated by applying a spline interpolation technique to the speech articulators’ motion graphs, and the computer graphics animation was then produced by ray‐tracing rendering. ...
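The keyframe placement and interpolation step can be illustrated with a minimal sketch. The abstract does not say which spline was used, so the code below substitutes a cubic Hermite spline with finite-difference tangents as one plausible choice. The single "lip opening" parameter, the viseme values, the timing offsets, and the function names `build_track` and `hermite_spline` are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_right

# Hypothetical static visemes: one lip-opening value per phoneme (arbitrary units).
STATIC = {"a": 1.0, "i": 0.3}

# Hypothetical dynamic viseme: several shapes with time offsets (seconds)
# relative to the phoneme time, e.g. closing, full closure, release for /p/.
DYNAMIC = {"p": [(-0.04, 0.2), (0.0, 0.0), (0.03, 0.4)]}


def build_track(phonemes):
    """Place viseme keyframes on a time axis from (time, phoneme) pairs."""
    keys = []
    for time, ph in phonemes:
        if ph in DYNAMIC:
            keys.extend((time + dt, v) for dt, v in DYNAMIC[ph])
        else:
            keys.append((time, STATIC[ph]))
    keys.sort()
    return [k[0] for k in keys], [k[1] for k in keys]


def hermite_spline(times, values, t):
    """Evaluate a cubic Hermite spline through the keyframes at time t.

    Assumes at least two keyframes with strictly increasing times.
    """
    n = len(times)
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]

    def tangent(i):
        # One-sided difference at the ends, centered difference elsewhere.
        if i == 0:
            return (values[1] - values[0]) / (times[1] - times[0])
        if i == n - 1:
            return (values[-1] - values[-2]) / (times[-1] - times[-2])
        return (values[i + 1] - values[i - 1]) / (times[i + 1] - times[i - 1])

    i = bisect_right(times, t) - 1      # segment [times[i], times[i + 1]]
    h = times[i + 1] - times[i]
    s = (t - times[i]) / h              # normalized position within the segment
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return (h00 * values[i] + h10 * h * tangent(i)
            + h01 * values[i + 1] + h11 * h * tangent(i + 1))


# Usage: the sequence /a/ /p/ /a/ sampled at 300 frames per second.
times, values = build_track([(0.0, "a"), (0.2, "p"), (0.4, "a")])
frames = [hermite_spline(times, values, k / 300) for k in range(121)]
```

In a full system each articulator parameter (jaw, lips, tongue) would presumably have its own track of this kind, and the sampled frames would drive the three‐dimensional model before rendering.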