A deep learning algorithm to increase intelligibility for hearing-impaired listeners in the presence of a competing talker and reverberation

2019 
For deep learning-based speech segregation to have translational significance as a noise-reduction tool, it must perform in a wide variety of acoustic environments. In the current study, performance was examined when target speech was subjected to interference from a single talker and room reverberation. Conditions were compared in which an algorithm was trained to remove both reverberation and interfering speech, or only interfering speech. A recurrent neural network (RNN) incorporating bidirectional long short-term memory (BLSTM) was trained to estimate the ideal ratio mask (IRM) corresponding to target speech. Substantial intelligibility improvements were found for hearing-impaired (HI) and normal-hearing (NH) listeners across a range of target-to-interferer ratios (TIRs). HI listeners performed better with reverberation removed, whereas NH listeners demonstrated no preference. Algorithm benefit averaged 56 percentage points for the HI listeners at the least-favorable TIR, allowing these listeners to numerically exceed the performance of young NH listeners without processing. The current study highlights the difficulty associated with perceiving speech in reverberant-noisy environments, and it extends the range of environments in which deep learning-based speech segregation can be effectively applied. This increasingly wide array of environments includes not only a variety of background noises and interfering speech but also room reverberation. [Work supported by NIH.]
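The abstract does not give the exact IRM formulation or the mixing procedure, so the following is only a minimal sketch under stated assumptions: it uses a commonly cited square-root energy-ratio form of the IRM on STFT magnitudes, and it sets the TIR by scaling the interferer relative to the target's average power. The function names, frame length, and hop size are hypothetical choices for illustration, not the study's settings.

```python
import numpy as np


def stft_mag(x, frame_len=512, hop=256):
    """Magnitude STFT (frames x frequency bins) using a Hann window.

    frame_len and hop are illustrative values, not taken from the study.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def scale_to_tir(target, interferer, tir_db):
    """Scale the interferer so the target-to-interferer power ratio is tir_db."""
    p_target = np.mean(target ** 2)
    p_interf = np.mean(interferer ** 2)
    gain = np.sqrt(p_target / (p_interf * 10 ** (tir_db / 10)))
    return interferer * gain


def ideal_ratio_mask(target, interference, eps=1e-12):
    """Assumed IRM form: sqrt(S^2 / (S^2 + N^2)) per time-frequency unit.

    `interference` is whatever should be removed: only the competing talker,
    or (if `target` is the reverberation-free speech) the competing talker
    plus the reverberant residue of the target.
    """
    s2 = stft_mag(target) ** 2
    n2 = stft_mag(interference) ** 2
    return np.sqrt(s2 / (s2 + n2 + eps))
```

In this sketch, the two training conditions described above differ only in what is passed as `target` and `interference`: keeping the reverberant target as `target` corresponds to removing interfering speech only, while passing the reverberation-free target corresponds to removing both reverberation and interfering speech.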