Improving HMM Based Speech Synthesis by Reducing Over-Smoothing Problems

2008 
Although hidden Markov model based speech synthesis has been proved to have good performance, there are still some factors which degrade the quality of synthesized speech: vocoder, model accuracy and over-smoothing. This paper analyzes these factors separately. Modifications for removing different factors are proposed. Experimental results show that over-smoothing in frequency domain mainly affect the quality of synthesized speech whereas over-smoothing in time domain can nearly be ignored. Time domain over-smoothing is generally caused by model structure accuracy problem and frequency domain over- smoothing is caused by training algorithm accuracy problem. Currently used model structure is capable of representing speech without quality degradation. ML-estimation based parameter training algorithm causes distortion of perception in speech synthesis. Modification for improving parameter training algorithm is more likely to improve the synthesizing performance.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    14
    References
    8
    Citations
    NaN
    KQI
    []