The WISTON Text to Speech System for Blizzard 2008

2008 
WISTON is a large-corpus TTS system based on unit selection. Its text analysis front end performs text pre-processing, word segmentation, POS tagging, phonetic transcription, and prosodic structure prediction. The prosodic parameters (duration, F0, and energy) are predicted by CART models from the input context information. In unit selection, the predicted F0, duration, and energy values are used to compute the target costs, while a mutual prosody constraint forms part of the concatenation costs used in the path search. Spectral smoothing is also applied during speech generation. The final system was entered in the Blizzard evaluation for both the English and the Mandarin tests, and it obtained good scores.
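The path search over target and concatenation costs can be illustrated with a generic Viterbi-style sketch. This is not the WISTON implementation: the function names, the dictionary fields `f0` (Hz) and `dur` (seconds), and both toy cost functions are assumptions made for illustration, and the abstract does not specify how the mutual prosody constraint or the cost weights are defined.

```python
from typing import Callable, List, Sequence, Tuple

Unit = dict  # hypothetical unit record, e.g. {"f0": 200.0, "dur": 0.1}


def select_units(
    candidates: Sequence[Sequence[Unit]],
    targets: Sequence[Unit],
    target_cost: Callable[[Unit, Unit], float],
    concat_cost: Callable[[Unit, Unit], float],
) -> Tuple[List[int], float]:
    """Pick one candidate per position so that the summed target and
    concatenation costs are minimal. Assumes every position has at least
    one candidate. Returns (chosen indices, total cost)."""
    # best[i][j]: cheapest cost of any path ending in candidate j at position i
    best = [[target_cost(targets[0], c) for c in candidates[0]]]
    back = [[-1] * len(candidates[0])]
    for i in range(1, len(candidates)):
        row, ptr = [], []
        for cand in candidates[i]:
            # Cost of reaching `cand` from each candidate at the previous position
            costs = [
                best[i - 1][k] + concat_cost(prev, cand)
                for k, prev in enumerate(candidates[i - 1])
            ]
            k_best = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[k_best] + target_cost(targets[i], cand))
            ptr.append(k_best)
        best.append(row)
        back.append(ptr)
    # Trace back from the cheapest final candidate
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = [j]
    for i in range(len(candidates) - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return path, total


def toy_target_cost(target: Unit, unit: Unit) -> float:
    # Assumed form: distance between predicted and candidate prosody
    return abs(target["f0"] - unit["f0"]) / 100.0 + abs(target["dur"] - unit["dur"])


def toy_concat_cost(left: Unit, right: Unit) -> float:
    # Assumed stand-in for a join cost: F0 mismatch between adjacent units
    return abs(left["f0"] - right["f0"]) / 100.0
```

In this sketch the target values would come from the prosody predictor (the CART models in the abstract), and the candidates from the speech corpus. A real system would typically add energy and spectral terms and weight each component.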