Frequency offset correction in single sideband speech for speaker verification

2014 
Communication system mismatch is a major cause of performance loss in speaker recognition. While microphone and handset differences have been considered in the NIST SRE, nonlinear communication system differences, such as modulation/demodulation (Mod/DeMod) carrier drift, have yet to be considered. In this study, an algorithm for estimating and correcting Mod/DeMod frequency offset distortion in single sideband (SSB) modulated speech is formulated as two processing steps. In the first step, the offset of the speech is roughly narrowed down to a small frequency interval, which eliminates the ambiguity caused by the periodicity of the spectrum. The second step performs fine-tuning within this pre-determined interval. For the first time, a statistical framework is developed for unique interval detection, in which an innovative acoustic feature is proposed to represent different offsets and state-of-the-art techniques, the total variability method and PLDA, are applied. Speaker recognition experiments on SSB speech obtained from the DARPA RATS corpus show that the proposed estimation and compensation method yields a significant performance improvement for speaker verification in SSB speech (up to 50% relative improvement in EER).
Index Terms: frequency offset, SSB, speaker verification, MFCC, i-Vector, PLDA
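To make the distortion concrete, the following is a minimal sketch of how a Mod/DeMod carrier mismatch shifts every spectral component of SSB speech by the same number of hertz, and how a known offset can be undone by shifting back. It is an illustration only, not the estimator described in the abstract: the function names, the 440 Hz test tone, the 60 Hz offset, and the 8 kHz sample rate are all assumptions chosen for the example, and the offset is taken as given rather than estimated.

```python
import numpy as np

def analytic_signal(x):
    """Return the analytic signal of a real 1-D array, computed via the FFT."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def shift_frequency(x, offset_hz, sample_rate):
    """Shift all spectral components of real signal x by offset_hz.

    This models (or, with a negated offset, compensates) the constant
    frequency offset produced by a carrier mismatch in SSB demodulation.
    """
    t = np.arange(len(x)) / sample_rate
    return np.real(analytic_signal(x) * np.exp(2j * np.pi * offset_hz * t))

def dominant_frequency(x, sample_rate):
    """Return the frequency (Hz) of the largest magnitude-spectrum bin."""
    magnitude = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1.0 / sample_rate)
    return freqs[np.argmax(magnitude)]

# Hypothetical test signal: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
clean = np.sin(2 * np.pi * 440 * t)

# Assumed offset of +60 Hz introduced by the channel, then compensated.
distorted = shift_frequency(clean, 60.0, sample_rate)
restored = shift_frequency(distorted, -60.0, sample_rate)

peak_distorted = dominant_frequency(distorted, sample_rate)
peak_restored = dominant_frequency(restored, sample_rate)
```

Because the shift is additive rather than multiplicative, harmonic relationships in voiced speech are destroyed, which is one reason such an offset degrades spectral features like MFCCs. In practice the offset is unknown and must be estimated before this compensation step can be applied.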