The IRISA Text-To-Speech System for the Blizzard Challenge 2017

Damien Lolive, Pierre Alain, Nelly Barbot, Jonathan Chevelu, Gwénolé Lecorvé, Claude Simon, Marie Tahon

2017

This paper describes the implementation of the IRISA unit selection-based TTS system for our participation in the Blizzard Challenge 2017. We describe the process followed to build the voice from the given data and the architecture of our system. The system uses a selection cost that notably integrates a DNN-based prosodic prediction and a specific score to handle narrative and direct speech parts. Unit selection relies on a Viterbi-based algorithm, with preselection filters used to reduce the search space. A penalty is introduced in the concatenation cost to block some concatenations based on their phonological class. Moreover, a fuzzy function is used to relax this penalty based on the concatenation quality with respect to the cost distribution. Integrating many constraints, this system achieves average results compared to the other systems.
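To make the selection scheme in the abstract more concrete, here is a minimal sketch of a Viterbi search over preselected candidate units, in which a class-based concatenation penalty is scaled by a fuzzy membership value. It is an illustration under stated assumptions, not the IRISA implementation: the abstract does not give the actual cost functions, so the piecewise-linear membership in `fuzzy_relax`, the thresholds `low` and `high`, and the callables `target_cost`, `concat_cost` and `penalty` are all hypothetical. Candidate lists are assumed to have already passed the preselection filters.

```python
from typing import Callable, List, Sequence, Tuple


def fuzzy_relax(concat_cost: float, low: float, high: float) -> float:
    """Assumed piecewise-linear membership: 0 below `low`, 1 above `high`.

    A good (cheap) join gets a value near 0, so the penalty is relaxed;
    a poor (expensive) join gets a value near 1, so the penalty applies fully.
    """
    if high <= low:
        return 1.0 if concat_cost >= high else 0.0
    return min(1.0, max(0.0, (concat_cost - low) / (high - low)))


def viterbi_select(
    candidates: Sequence[Sequence[str]],
    target_cost: Callable[[int, str], float],
    concat_cost: Callable[[str, str], float],
    penalty: Callable[[str, str], float],
    low: float,
    high: float,
) -> Tuple[List[str], float]:
    """Return the cheapest unit sequence and its total cost.

    `candidates[i]` holds the units that survived preselection for
    target position i (hypothetical string identifiers here).
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every target position needs at least one candidate")

    # best[j] = (cumulative cost, path) for paths ending in candidates[i][j]
    best = [(target_cost(0, u), [u]) for u in candidates[0]]

    for i in range(1, len(candidates)):
        new_best = []
        for unit in candidates[i]:
            options = []
            for cost, path in best:
                prev = path[-1]
                cc = concat_cost(prev, unit)
                # Penalty is weighted by the fuzzy membership of the join cost.
                step = cc + fuzzy_relax(cc, low, high) * penalty(prev, unit)
                options.append((cost + step, path))
            cost, path = min(options, key=lambda o: o[0])
            new_best.append((cost + target_cost(i, unit), path + [unit]))
        best = new_best

    cost, path = min(best, key=lambda b: b[0])
    return path, cost
```

In this sketch the thresholds `low` and `high` would be derived from the distribution of concatenation costs (for example, from chosen quantiles), which is one plausible reading of "with respect to the cost distribution"; the abstract does not specify how they are set.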

Keywords:

Speech recognition
Fuzzy logic
Viterbi algorithm
Architecture
Concatenation
Speech synthesis
Direct speech
Computer science
Cost distribution
