MOS Naturalness and the Quest for Human-Like Speech

Sajad Shirali-Shahreza, Gerald Penn

2018

This paper reconsiders the use of MOS (mean opinion score) naturalness as an instrument for measuring the quality, as opposed to the intelligibility, of speech. We revisit an earlier proposed alternative, the paired comparison or “AB” test, and present new empirical evidence that it is indeed a better method for evaluating TTS quality. Using the AB test, we evaluate three older TTS systems and a recent deep-learning approach against native North American and Indian speech and show that TTS had, in fact, already crossed the threshold of human-like speech synthesis some time ago. This suggests that a systematic reappraisal of the concept of abstract “naturalness” of speech is in order.
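
The abstract does not say how the AB responses were analysed, so the following is only a minimal sketch of how the two instruments differ in practice, under our own assumptions. MOS averages absolute ratings (assumed here to be on a 1-5 scale), whereas a paired comparison yields preference counts that can be checked against a "no preference" null with an exact two-sided binomial (sign) test. The function names and the choice of test are illustrative and are not taken from the paper.

```python
from math import comb


def mean_opinion_score(ratings):
    """Average a list of absolute ratings (assumed 1-5 scale)."""
    return sum(ratings) / len(ratings)


def ab_preference_p_value(wins_a, wins_b):
    """Exact two-sided binomial (sign) test for an AB preference count.

    Null hypothesis: listeners pick A or B with equal probability.
    Ties are assumed to have been dropped before counting.
    """
    n = wins_a + wins_b
    k = max(wins_a, wins_b)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double for the two-sided test, capped at 1.
    return min(1.0, 2 * tail)
```

With made-up counts, `ab_preference_p_value(8, 2)` tests whether an 8-to-2 preference for one stimulus over another is distinguishable from chance. A large p-value when comparing TTS against human speech would be consistent with listeners being unable to tell them apart, although it would not prove equivalence.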

Keywords:

Computer science
Speech recognition
Naturalness
Speech coding
Speech synthesis
Intelligibility (communication)
Empirical evidence
Data modeling
Paired comparison
