Influence of Language Differences in Crowdsourcing Speech Quality Assessment Studies

2021 
The quality of the speech signal is essential, as it influences the user experience of voice-based interactive systems. Speech quality studies have traditionally been conducted in controlled laboratory rooms with professional audio equipment. Nowadays, crowdsourcing represents a valid alternative for the rapid assessment of large speech databases at a fraction of the cost and time of traditional laboratory practices. However, crowdsourcing users perform tasks in an unsupervised manner, so it is challenging to control whether their skills match those of the study’s intended audience. This is important in speech quality evaluations, as some listeners may end up participating in a listening test in a target language other than their mother tongue. This paper investigates the influence of assessing the quality of a German speech dataset with native English and Spanish speakers. To this end, three crowdsourcing studies were conducted in which listeners evaluated the quality of speech stimuli following ITU-T Rec. P.808. A strong Pearson correlation and a low RMSE were found between the laboratory ratings and the scores collected in all crowdsourcing studies, regardless of the listeners’ mother tongue. Still, a bias was seen between the mean opinion scores from the German crowd-workers and those from the native English and Spanish speakers: the non-German participants tended to overestimate the quality of the speech stimuli.
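The three quantities the abstract relies on (Pearson correlation, RMSE, and a bias between mean opinion scores) can be illustrated with a minimal sketch. The per-condition MOS values and the function names below are made up for illustration; they are not data or code from the paper, and the paper's exact analysis procedure may differ.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)

def rmse(reference, observed):
    """Root-mean-square error between reference and observed scores."""
    n = len(reference)
    return math.sqrt(sum((o - r) ** 2 for r, o in zip(reference, observed)) / n)

def mean_bias(reference, observed):
    """Mean signed difference; positive means `observed` rates higher."""
    n = len(reference)
    return sum(o - r for r, o in zip(reference, observed)) / n

# Hypothetical per-condition MOS values (1 to 5 scale), for illustration only.
lab_mos = [1.5, 2.0, 3.0, 4.0, 4.5]
crowd_mos = [1.8, 2.3, 3.3, 4.3, 4.8]  # same ranking, shifted upward

r = pearson(lab_mos, crowd_mos)
err = rmse(lab_mos, crowd_mos)
bias = mean_bias(lab_mos, crowd_mos)
print(f"Pearson r = {r:.3f}, RMSE = {err:.3f}, bias = {bias:+.3f}")
```

In this toy case the crowd scores preserve the ranking of the conditions, so the correlation is high, while the constant upward shift shows up as a positive bias. This mirrors how a strong correlation and an overestimation bias can coexist.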