Investigating Domain Sensitivity of DNN Embeddings for Speaker Recognition Systems

2019 
Speaker embedding frameworks achieve state-of-the-art speaker recognition performance by modeling speaker-discriminant information directly with deep neural networks (DNNs). Since the introduction of neural-network-based speaker embeddings, researchers have explored the requirements for training an effective embedding network. For optimal performance, however, the domain of the data used for system development should match the domain of operation. In this paper, we investigate the sensitivity of the embedding space to domain mismatch. Specifically, performance degrades when back-end scoring with embeddings is performed with out-domain data. To compensate for the domain mismatch, we propose two novel deep domain adaptation techniques based on autoencoder architectures trained on embeddings in an unsupervised fashion. The results show that domain mismatch can be compensated effectively by using autoencoders to adapt the out-domain data to the in-domain.
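
The general idea of adapting embeddings with an autoencoder trained without speaker labels can be illustrated with a minimal sketch. This is not either of the two techniques proposed in the paper, whose architectures the abstract does not specify. It assumes a generic single-hidden-layer autoencoder fitted to in-domain embeddings, through which out-domain embeddings are then passed. The data are synthetic, and all names (`train_autoencoder`, `adapt`, `hidden_dim`, the 16-dimensional embeddings) are hypothetical.

```python
import numpy as np


def train_autoencoder(x, hidden_dim=8, lr=0.5, epochs=300, seed=0):
    """Fit a one-hidden-layer autoencoder to x by gradient descent.

    Only the embeddings are used (no speaker labels), so the training
    is unsupervised. Returns the parameters and the per-epoch MSE.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w1 = rng.normal(0.0, 0.1, (d, hidden_dim))
    b1 = np.zeros(hidden_dim)
    w2 = rng.normal(0.0, 0.1, (hidden_dim, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        # Forward pass: tanh encoder, linear decoder.
        h = np.tanh(x @ w1 + b1)
        y = h @ w2 + b2
        err = y - x
        losses.append(float(np.mean(err ** 2)))
        # Backward pass for the mean squared reconstruction error.
        g_y = 2.0 * err / (n * d)
        g_w2 = h.T @ g_y
        g_b2 = g_y.sum(axis=0)
        g_z = (g_y @ w2.T) * (1.0 - h ** 2)
        g_w1 = x.T @ g_z
        g_b1 = g_z.sum(axis=0)
        w1 -= lr * g_w1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
    return (w1, b1, w2, b2), losses


def adapt(params, x):
    """Pass embeddings through the trained autoencoder."""
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2


# Synthetic stand-ins for real speaker embeddings (assumption):
# the out-domain set is a shifted, rescaled copy of the same distribution.
rng = np.random.default_rng(1)
in_domain = rng.normal(0.0, 1.0, (200, 16))
out_domain = 1.5 * rng.normal(0.0, 1.0, (200, 16)) + 0.5

params, losses = train_autoencoder(in_domain)
adapted = adapt(params, out_domain)
```

In a real system the adapted out-domain embeddings would then feed the back-end scoring stage. Whether such a mapping actually reduces the mismatch has to be checked on held-out in-domain trials.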