Data augmentation and post selection for improved replay attack detection

2019 
Vulnerabilities of Automatic Speaker Verification (ASV) technology are well recognized and have generated considerable interest in designing anti-spoofing detectors. Replay attacks pose a severe threat because they are relatively difficult to detect and easy to mount. In this paper, a high-performing spoofing detection countermeasure is presented. Deep Learning (DL) based speech embedding extractors are combined with a novel data augmentation approach to improve detection performance. To select augmented samples of high quality and diversity while avoiding the bias introduced by subjective human perception, we propose the use of a Support Vector Machine (SVM) based post-filter. With the additional informative training data thus generated, over-fitting and poor generalization can be significantly alleviated. Experimental results, measured by equal error rate (EER), indicate a relative improvement of 30% on the development and evaluation subsets. This result motivates the proposed audio data augmentation and encourages future research on the selection of generated samples for speaker spoofing detection.
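Since the results above are reported as EERs and as a relative improvement, the following minimal sketch illustrates how these two quantities are commonly computed. It is not the authors' evaluation code: the function names `equal_error_rate` and `relative_improvement`, the convention that a higher score means "more likely genuine", and all score values are assumptions made for illustration only.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Assumption: a higher score means the trial is more likely genuine.
    """
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance rate: spoofed trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_improvement(baseline_eer, new_eer):
    """Relative EER reduction, e.g. 0.3 for a 30% relative improvement."""
    return (baseline_eer - new_eer) / baseline_eer


# Hypothetical detector scores, not taken from the paper.
genuine = [0.9, 0.8, 0.7, 0.4]
spoof = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine, spoof)
```

With a finite set of trials the two error rates rarely cross exactly, so the sketch reports the midpoint at the threshold where they are closest; evaluation toolkits may interpolate instead, which can give slightly different values.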