Multi-scale kernels for short utterance speaker recognition

Wei-Qiang Zhang, Junhong Zhao, Wen-Lin Zhang, Jia Liu

2014

Short utterances pose a great challenge for speaker recognition because very limited data are available for training and testing. For robust estimation, a model for short utterances should have fewer parameters than one for long utterances; however, this can limit the model's descriptive capability. In this paper, we propose a multi-scale kernel (MSK) approach to solve this problem. We construct a series of kernels at different scales and combine them through multiple kernel learning (MKL) optimization. In this way, both the robustness and the scalability of the model are enhanced. Experimental results on the NIST SRE 2010 10sec-10sec dataset show that the proposed MSK method outperforms the traditional approach of Gaussian mixture model supervectors (GSV) followed by a support vector machine (SVM).
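
The idea of building one kernel per scale and then combining them can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a "scale" is defined or which MKL solver is used, so the sketch assumes that each scale yields its own feature matrix (here, random hypothetical features of dimension 8, 32 and 128), and it replaces MKL optimization with a much simpler kernel-target alignment heuristic for the weights. All function names are illustrative.

```python
import numpy as np

def linear_kernel(X):
    """Gram matrix of a linear kernel, cosine-normalized so the diagonal is 1."""
    K = X @ X.T
    d = np.sqrt(np.clip(np.diag(K), 1e-12, None))
    return K / np.outer(d, d)

def alignment(K, y):
    """Kernel-target alignment between K and the ideal kernel y y^T."""
    Y = np.outer(y, y)
    return float(np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y)))

def combine_kernels(kernels, y):
    """Convex combination of kernels, weighted by non-negative alignment.

    This is a stand-in heuristic (an assumption), not an MKL optimizer.
    """
    a = np.array([max(alignment(K, y), 0.0) for K in kernels])
    if a.sum() == 0.0:
        a = np.ones(len(kernels))  # fall back to uniform weights
    weights = a / a.sum()
    K_combined = sum(w * K for w, K in zip(weights, kernels))
    return weights, K_combined

# Hypothetical toy data: 20 utterances, two classes, three "scales".
rng = np.random.default_rng(0)
y = np.array([1] * 10 + [-1] * 10)
scales = [8, 32, 128]  # assumed feature dimensions, one per scale
features = [rng.normal(size=(20, d)) + 0.5 * y[:, None] for d in scales]

kernels = [linear_kernel(X) for X in features]
weights, K_combined = combine_kernels(kernels, y)
```

Because the weights are non-negative and sum to one, the combined matrix stays a valid (positive semidefinite) kernel and could be passed to an SVM that accepts a precomputed Gram matrix.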

Keywords:

Speech recognition
Speaker diarisation
Mixture model
NIST
Speaker recognition
Support vector machine
Robustness (computer science)
Computer science
Utterance
Machine learning
Multiple kernel learning
Pattern recognition
Artificial intelligence
Kernel (linear algebra)

