Enhanced voice activity detection using acoustic event detection and classification

2011 
We examine a user-friendly voice interface that requires hands-free speech acquisition in a continuously listening environment. Traditional voice activity detection (VAD) algorithms cannot reliably distinguish potential acoustic event sounds from speech, so the speech recognition system is activated frequently or incorrectly. In this paper, we propose a novel voice activity detection technique that consists of two major modules: 1) a classification module and 2) a detection module. In the classification module, we label successive audio segments based on trained models. Then, in the detection module, we remove the acoustic event sounds and determine the explicit utterance boundaries in the input audio stream. As a result, the proposed technique enables efficient operation of speech recognition in a continuously listening environment without any touch and/or key input. Experiments in a real-world environment and a performance comparison with state-of-the-art techniques demonstrate the effectiveness of the proposed technique.
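The detection step described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the classification module has already produced one label per fixed-length audio segment, and the label names, the function `find_utterances`, and the parameters `seg_dur`, `max_gap`, and `min_len` are all hypothetical. Segments not labelled as speech (acoustic events, silence) are dropped, and runs of speech segments are merged into utterances with explicit start and end times.

```python
from typing import List, Tuple

SPEECH = "speech"  # hypothetical label name; the abstract does not specify labels


def find_utterances(labels: List[str], seg_dur: float = 0.1,
                    max_gap: int = 2, min_len: int = 3) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of utterances found in `labels`.

    labels  : one class label per fixed-length segment (assumed classifier output)
    seg_dur : segment duration in seconds (assumed value)
    max_gap : non-speech segments tolerated inside an utterance (assumed value)
    min_len : minimum span, in segments, for an utterance to be kept (assumed value)
    """
    utterances = []
    start = None        # index of the first speech segment of the open utterance
    last_speech = None  # index of the most recent speech segment
    for i, label in enumerate(labels):
        if label == SPEECH:
            if start is None:
                start = i
            last_speech = i
        elif start is not None and i - last_speech > max_gap:
            # Too many non-speech segments in a row: close the utterance.
            if last_speech - start + 1 >= min_len:
                utterances.append((start * seg_dur, (last_speech + 1) * seg_dur))
            start = None
    # Close an utterance that is still open at the end of the stream.
    if start is not None and last_speech - start + 1 >= min_len:
        utterances.append((start * seg_dur, (last_speech + 1) * seg_dur))
    return utterances


if __name__ == "__main__":
    stream = ["silence", "event", "event", "silence", "speech", "speech",
              "silence", "speech", "speech", "silence", "silence", "silence",
              "event", "speech", "silence", "silence", "silence"]
    print(find_utterances(stream, seg_dur=1.0))
```

In this sketch, acoustic events never open an utterance, so a recognizer fed only the returned intervals would not be triggered by them. The short gap tolerance keeps brief pauses inside a single utterance.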