Multistream robust speaker recognition based on speech intelligibility

2011 
Delimiting the most informative voice segments of an acoustic signal is often a crucial initial step for any speech processing system. In the current work, we propose a novel segmentation approach based on a perception-based measure of speech intelligibility. Unlike segmentation approaches based on various forms of voice-activity detection (VAD), the proposed approach exploits higher-level perceptual information about the signal's intelligibility levels. This intelligibility-based classification is integrated into a novel multistream framework for the automatic speaker recognition task. The multistream system processes the input acoustic signal along multiple independent streams reflecting various levels of intelligibility and then fuses the decision scores from the multiple streams according to their intelligibility contribution. Our results show that the proposed multistream system achieves significant improvements in both clean and noisy conditions when compared with a baseline and a state-of-the-art voice-activity detection algorithm.
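To make the fusion step concrete, the following is a minimal sketch of intelligibility-weighted score fusion, assuming that per-stream speaker scores and per-stream intelligibility weights are already available; the function and variable names are hypothetical and the exact fusion rule used in the paper is not specified here.

```python
import numpy as np

def fuse_multistream_scores(stream_scores, intelligibility_weights):
    """Fuse per-stream speaker-recognition scores using weights that
    reflect each stream's estimated intelligibility contribution.
    (Illustrative sketch only; not the paper's exact fusion rule.)"""
    w = np.asarray(intelligibility_weights, dtype=float)
    w = w / w.sum()                      # normalize weights to sum to 1
    s = np.asarray(stream_scores, dtype=float)
    return float(np.dot(w, s))           # weighted sum of stream scores

# Example: three streams covering high-, medium-, and low-intelligibility segments
scores = [2.1, 0.8, -0.3]    # illustrative per-stream scores (e.g., log-likelihood ratios)
weights = [0.6, 0.3, 0.1]    # illustrative intelligibility contributions per stream
print(fuse_multistream_scores(scores, weights))
```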