Predicting speech intelligibility based on the envelope power signal‐to‐noise ratio after modulation‐frequency selective processing.

2011 
A model for predicting the intelligibility of processed noisy speech is proposed. The model is a speech‐based version of the envelope power spectrum model [Ewert and Dau, 2000], which was originally developed to account for modulation detection and masking data. It estimates the speech‐to‐noise envelope power ratio, SNRenv, at the output of a modulation filterbank and relates this metric to speech intelligibility using the concept of an ideal observer. Model predictions were compared to literature data obtained with speech mixed with stationary speech‐shaped noise. The model was also tested in conditions where noisy speech was subjected to reverberation and spectral subtraction. Consistent with new experimental data, the model predicted an increase in the speech reception threshold (SRT) with increasing reverberation time and in conditions of spectral subtraction, the latter in contrast to predictions of the speech transmission index (STI). An analysis of the model’s internal representation of the stimuli processed by spectral subtraction revealed that the ...
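The SNRenv idea can be illustrated with a minimal sketch for a single modulation band. This is not the published implementation. It assumes the envelopes are already extracted, replaces a modulation filter with a single DFT bin at one modulation frequency, and assumes the speech envelope power can be approximated as the envelope power of the mixture minus that of the noise alone. The function names, the `floor` parameter, and its value are hypothetical.

```python
import math

def envelope_power(env, fs, fmod):
    """Envelope power near modulation frequency fmod (Hz), normalized by DC power.

    Assumption: a single DFT bin stands in for one modulation filter.
    """
    n = len(env)
    dc = sum(env) / n
    re = sum((e - dc) * math.cos(2 * math.pi * fmod * i / fs) for i, e in enumerate(env))
    im = sum((e - dc) * math.sin(2 * math.pi * fmod * i / fs) for i, e in enumerate(env))
    amp = 2 * math.hypot(re, im) / n      # amplitude of the fmod component
    return (amp ** 2 / 2) / dc ** 2       # AC power relative to squared DC level

def snr_env(env_mix, env_noise, fs, fmod, floor=1e-3):
    """Speech-to-noise envelope power ratio in one modulation band (linear units).

    Assumption: speech envelope power ~ mixture power minus noise power.
    The floor (hypothetical value) keeps the ratio finite and non-negative.
    """
    p_noise = max(envelope_power(env_noise, fs, fmod), floor)
    p_speech = max(envelope_power(env_mix, fs, fmod) - p_noise, floor)
    return p_speech / p_noise

if __name__ == "__main__":
    fs = 100                                   # envelope sampling rate in Hz
    t = [i / fs for i in range(fs)]            # one second of samples
    # Synthetic envelopes: strong 4 Hz modulation in the mixture, weak in the noise.
    mix = [1 + 0.5 * math.sin(2 * math.pi * 4 * x) for x in t]
    noise = [1 + 0.1 * math.sin(2 * math.pi * 4 * x) for x in t]
    ratio = snr_env(mix, noise, fs, 4.0)
    print(f"SNRenv at 4 Hz: {10 * math.log10(ratio):.1f} dB")
```

In a fuller model this ratio would be computed per audio channel and per modulation filter, then combined before being mapped to intelligibility; those stages are omitted here because the abstract does not specify them.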