Replay spoofing countermeasure using autoencoder and siamese networks on ASVspoof 2019 challenge

2020 
Abstract Automatic Speaker Verification (ASV) is authentication of individuals by analyzing their speech signals. Different synthetic approaches allow spoofing to deceive ASV systems (ASVs), whether using techniques to imitate a voice or reconstruct the features. Attackers beat up the ASVs using four general techniques; impersonation, speech synthesis, voice conversion, and replay. The last technique is considered as a common and high potential tool for spoofing purposes since replay attacks are more accessible and require no technical knowledge of adversaries. In this study, we introduce a novel replay spoofing countermeasure for ASVs. Accordingly, we use the Constant Q Cepstral Coefficient (CQCC) features fed into an autoencoder to attain more informative features and to consider the noise information of spoofed utterances for discrimination purpose. Finally, different configurations of the Siamese network are used for the first time in this context for classification. The experiments performed on ASVspoof challenge 2019 dataset using Equal Error Rate (EER) and Tandem Detection Cost Function (t-DCF) as evaluation metrics show that the proposed system improved the results over the baseline by 10.73% and 0.2344 in terms of EER and t-DCF, respectively.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    38
    References
    9
    Citations
    NaN
    KQI
    []