Corrective feedback, emphatic speech synthesis, visual-speech exaggeration, pronunciation learning

Yaohua Bu, Weijun Li, Tianyi Ma, Shengqi Chen, Jia Jia, Kun Li, Xiaobo Lu

2020

To give second language (L2) learners more discriminative feedback that helps them identify their mispronunciations, we propose a method for exaggerated visual-speech feedback in computer-assisted pronunciation training (CAPT). The speech exaggeration is realized by an emphatic speech generation neural network based on Tacotron, while the visual exaggeration is accomplished by ADC Viseme Blending, namely increasing the Amplitude of movement, extending the phone's Duration, and enhancing the color Contrast. User studies show that exaggerated feedback outperforms the non-exaggerated version in helping learners with both pronunciation identification and pronunciation improvement.
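The three ADC adjustments named above can be illustrated with a minimal sketch. This is not the authors' implementation: the `VisemeFrame` type, its normalized fields, the gain values, and the clamping to the range 0 to 1 are all assumptions made here for illustration, since the abstract does not specify how each factor is parameterized.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List, Set


@dataclass(frozen=True)
class VisemeFrame:
    """Hypothetical viseme keyframe; all fields are illustrative assumptions."""
    phone: str
    amplitude: float    # articulator displacement, normalized to [0, 1]
    duration_ms: float  # how long the phone's viseme is shown
    contrast: float     # color contrast of the highlighted region, [0, 1]


def _clamp01(x: float) -> float:
    """Keep a normalized value inside [0, 1]."""
    return max(0.0, min(1.0, x))


def exaggerate(frame: VisemeFrame,
               amp_gain: float = 1.5,
               dur_gain: float = 1.5,
               contrast_gain: float = 1.3) -> VisemeFrame:
    """Apply the A, D, and C adjustments to one frame (assumed gain values)."""
    return replace(
        frame,
        amplitude=_clamp01(frame.amplitude * amp_gain),     # A: larger movement
        duration_ms=frame.duration_ms * dur_gain,           # D: longer phone
        contrast=_clamp01(frame.contrast * contrast_gain),  # C: stronger contrast
    )


def exaggerate_sequence(frames: Iterable[VisemeFrame],
                        target_phones: Set[str],
                        **gains: float) -> List[VisemeFrame]:
    """Exaggerate only the frames whose phone was flagged as mispronounced."""
    return [exaggerate(f, **gains) if f.phone in target_phones else f
            for f in frames]
```

In this sketch only the flagged phones are altered, so the exaggerated segment stands out against unchanged neighbors; whether the original system restricts exaggeration in this way is likewise an assumption.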

Keywords:

Computer science
Phone
Pronunciation
Corrective feedback
Artificial neural network
Viseme
Discriminative model
Speech recognition
Speech synthesis
Exaggeration
