Speech synthesis using articulatory-knowledge based HMM structure

2014 
In this paper, a different HMM structure is proposed to model the context-dependent spectral characteristics of a speech unit in order to improve the fluency of synthetic speech. Instead of using decision trees, we reduce the huge number of context combinations by using articulatory knowledge of phonemes. To evaluate the proposed HMM structure, three Mandarin speech synthesis systems using different HMM structures are constructed for comparison. In all three systems, prosodic parameters are generated by the same previously developed ANN module, while spectral parameters are generated using HMMs. For waveform synthesis, the same previously developed HNM (harmonic plus noise model) based synthesis module is used. According to the results of listening tests, speech synthesized with the proposed HMM structure is significantly more fluent than speech synthesized with the other HMM structures. In addition, the average spectral distances measured between recorded and synthetic sentences show that the proposed HMM structure yields a smaller spectral distance than the other HMM structures.