Real-time Voice Activity Detector Using Gammatone Filter and Modified Long-Term Signal Variability

2017 
In this paper, a real-time, robust voice activity detector (VAD) is proposed. The proposed VAD adopts the gammatone filter and modifies the existing long-term signal variability (LTSV) measure; the resulting feature is referred to as GMLTSV. It is an improved version of an existing VAD that used a gammatone filter and entropy. The LTSV measure is adapted to the amplitude envelopes extracted by the gammatone filter by swapping the entropy and variance used in the measure, which reduces the effect of noise on the extracted temporal envelopes and improves the discriminative power of the extracted feature. The proposed algorithm also implements an adaptive threshold, computed using a nonlinear filter, to track the short-term trend of the extracted feature in real time. The GMLTSV-based VAD is tested on clean speech signals from the TIMIT test corpus that are degraded at SNRs ranging from -10 dB to 20 dB by non-stationary noise (e.g., airport, babble, and exhibition noise from the Aurora-2 database) and by stationary noise (e.g., additive white Gaussian noise). The evaluation shows that the proposed GMLTSV-based VAD is robust in speech and non-speech detection even at low signal-to-noise ratio (SNR) and outperforms the other voice activity detectors compared in the evaluation. The proposed VAD achieves satisfactory accuracy compared with the impractical single frequency filtering based VAD while implementing a real-time scheme for practical application.
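The test condition described in the abstract, clean speech degraded by noise at a fixed SNR, can be illustrated with a minimal sketch. It is not taken from the paper: the function names `mix_at_snr` and `measured_snr_db`, the 16 kHz sine wave standing in for a TIMIT utterance, and the white Gaussian noise are all assumptions made for illustration. The sketch scales the noise so that the ratio of average speech power to average noise power over the whole signal matches the target SNR in dB.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the overall SNR equals `snr_db`.

    Hypothetical helper: SNR is defined here over the whole signal as
    10*log10(mean speech power / mean noise power).
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Repeat the noise if it is shorter than the speech, then trim to length.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[:len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Gain g such that p_speech / (g^2 * p_noise) = 10^(snr_db / 10).
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

def measured_snr_db(speech, noisy):
    """Recover the SNR of a mixture given the clean reference signal."""
    residual = np.asarray(noisy, dtype=float) - np.asarray(speech, dtype=float)
    return 10.0 * np.log10(np.mean(np.square(speech)) / np.mean(residual ** 2))

# Assumed stand-ins: a 200 Hz tone for speech, white Gaussian noise.
fs = 16000
speech = np.sin(2 * np.pi * 200 * np.arange(fs) / fs)
noise = np.random.default_rng(0).standard_normal(fs)

for target in (-10, 0, 10, 20):
    noisy = mix_at_snr(speech, noise, target)
    print(target, round(measured_snr_db(speech, noisy), 2))
```

Defining SNR over the whole utterance is one common convention; an evaluation that measures speech power over active speech segments only would produce different scaling for the same nominal SNR.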