Improving speaker verification with figure of merit training

Xiaohan Li, Eric Chang, Bei-qian Dai

2002

This paper proposes Figure of Merit (FOM) training, a novel discriminative training method for Gaussian mixture models in text-independent speaker verification. FOM training adjusts the model parameters to maximize the FOM of a ROC curve, rather than only approximating the underlying distribution of each speaker's acoustic observations, as Maximum Likelihood Estimation does. Text-independent speaker verification experiments were conducted on the 1996 NIST Speaker Recognition Evaluation corpus. Compared with the standard EM training method, FOM training significantly improves performance: the detection cost function (DCF) was reduced from 0.0369 to 0.0286 in the matched condition and from 0.0826 to 0.0537 in the mismatched condition.
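To make the reported DCF figures easier to interpret, the following is a minimal sketch of how a detection cost function can be computed from verification scores. It is not the authors' FOM training procedure, and the abstract does not state the cost parameters; the defaults below (C_miss = 10, C_fa = 1, P_target = 0.01) are the values commonly used in NIST speaker recognition evaluations and are assumed here. All function names are hypothetical.

```python
def error_rates(target_scores, impostor_scores, threshold):
    """Return (P_miss, P_fa) when trials scoring >= threshold are accepted."""
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return p_miss, p_fa


def dcf(p_miss, p_fa, c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Detection cost: weighted sum of miss and false-alarm rates.

    The default costs and target prior are assumptions, not values
    taken from the abstract.
    """
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


def min_dcf(target_scores, impostor_scores):
    """Smallest DCF over all thresholds that change the accept/reject decisions."""
    candidates = sorted(set(target_scores) | set(impostor_scores))
    candidates.append(candidates[-1] + 1.0)  # threshold that rejects every trial
    return min(
        dcf(*error_rates(target_scores, impostor_scores, t)) for t in candidates
    )


def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


# Relative DCF reductions implied by the figures in the abstract.
matched = relative_reduction(0.0369, 0.0286)
mismatched = relative_reduction(0.0826, 0.0537)
print(f"matched: {matched:.1%}, mismatched: {mismatched:.1%}")
```

Under these assumptions, the abstract's numbers correspond to a relative DCF reduction of roughly 22% in the matched condition and roughly 35% in the mismatched condition.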

Keywords:

NIST
Speaker recognition
Mixture model
Speech recognition
Artificial neural network
Discriminative model
Artificial intelligence
Pattern recognition
Maximum likelihood
Computer science
Figure of merit
Speaker verification
Training methods
