Diphthong Synthesis Using the Dynamic 3D Digital Waveguide Mesh

2018 
Articulatory speech synthesis has the potential to offer more natural-sounding synthetic speech than established concatenative or parametric synthesis methods. Time-domain acoustic models are particularly well suited to the dynamic nature of the speech signal, and recent work has demonstrated the potential of dynamic vocal tract models that accurately reproduce the vocal tract geometry. This paper presents a dynamic 3D digital waveguide mesh (DWM) vocal tract model, capable of movement to produce diphthongs. The technique is compared to existing dynamic 2D and static 3D DWM models, for both monophthongs and diphthongs. The results indicate that the proposed model provides improved formant accuracy over existing DWM vocal tract models. Furthermore, the computational requirements of the proposed method are significantly lower than those of comparable dynamic simulation techniques. This paper represents another step toward a fully functional articulatory vocal tract model, which will lead to more natural speech synthesis systems for use across society.