Improving the detection efficiency of the VMR-WB VAD algorithm on music signals

2008 
Speech codecs are usually equipped with a voice activity detection (VAD) algorithm to enable efficient coding of inactive frames and the discontinuous transmission (DTX) mode. High VAD efficiency for speech in noisy environments is often traded off against robustness for music. This is also the case for the VMR-WB codec recently standardized by 3GPP2: its VAD fails to detect portions of some critical music samples. In this contribution we propose a method to improve the performance of the VMR-WB VAD on music signals. The idea is to measure the stability of tones in the spectral domain by means of per-tone correlation analysis. With this approach, the music detection accuracy is increased to ∼99% and the problem of misclassification is significantly reduced. The proposed method has been implemented in the G.718 codec currently being standardized by the ITU-T.
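
To make the idea of per-tone correlation analysis concrete, the following is a minimal illustrative sketch, not the procedure used in VMR-WB or G.718. It assumes two consecutive log-magnitude spectra are available as plain lists, picks local maxima of the current spectrum as candidate tones, and correlates a small window around each tone with the same bins of the previous frame. The function names (`find_peaks`, `pearson`, `tonal_stability`), the fixed window `half_width`, and the averaging of per-tone scores are all assumptions made for illustration.

```python
import math


def find_peaks(spec):
    """Return indices of local maxima, used here as candidate tones."""
    return [i for i in range(1, len(spec) - 1)
            if spec[i] > spec[i - 1] and spec[i] >= spec[i + 1]]


def pearson(a, b):
    """Normalized correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def tonal_stability(prev_spec, cur_spec, half_width=2):
    """Average per-tone correlation between two consecutive spectra.

    half_width is a hypothetical parameter: the number of bins taken on
    each side of a peak. Values near 1 suggest tones that stay in place
    from frame to frame, as is typical of music.
    """
    scores = []
    for p in find_peaks(cur_spec):
        lo = max(0, p - half_width)
        hi = min(len(cur_spec), p + half_width + 1)
        scores.append(pearson(prev_spec[lo:hi], cur_spec[lo:hi]))
    return sum(scores) / len(scores) if scores else 0.0


# Toy spectra: two tones at bins 5 and 12.
cur = [0.0] * 16
cur[4], cur[5], cur[6] = 5.0, 10.0, 5.0
cur[11], cur[12], cur[13] = 4.0, 8.0, 4.0

# Previous frame with the same tones shifted by two bins (unstable case).
shifted = [0.0] * 16
shifted[6], shifted[7], shifted[8] = 5.0, 10.0, 5.0
shifted[13], shifted[14], shifted[15] = 4.0, 8.0, 4.0

stable_score = tonal_stability(cur, cur)
unstable_score = tonal_stability(shifted, cur)
```

In this toy setup, a frame compared with an identical predecessor scores much higher than one whose tones have moved. A detector built on such a measure would presumably smooth the score over time and compare it against a threshold before overriding the VAD decision; those details are not given in the abstract.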