Improving speaker verification with figure of merit training

2002 
This paper proposes Figure of Merit (FOM) training, a novel discriminative training method for Gaussian mixture models in text-independent speaker verification. FOM training adjusts the model parameters to maximize the FOM of a ROC curve, rather than only approximating the underlying distribution of each speaker's acoustic observations, as Maximum Likelihood Estimation does. Text-independent speaker verification experiments were conducted on the 1996 NIST Speaker Recognition Evaluation corpus. Compared with the standard EM training method, FOM training provides significantly improved performance: the detection cost function (DCF) was reduced from 0.0369 to 0.0286 in the matched condition and from 0.0826 to 0.0537 in the mismatched condition, relative reductions of about 22% and 35% respectively.
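To make the reported DCF figures easier to interpret, the following is a minimal sketch of how a detection cost can be computed from verification scores. The abstract does not state the cost parameters, so the weights used here (`c_miss=10`, `c_fa=1`, `p_target=0.01`) are assumptions, chosen because they are commonly associated with NIST speaker recognition evaluations. The function names and the toy scores are hypothetical, and the sketch illustrates only the evaluation metric, not the FOM training procedure itself.

```python
def detection_cost(p_miss, p_fa, c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Weighted sum of miss and false-alarm rates (assumed cost parameters)."""
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


def error_rates(target_scores, impostor_scores, threshold):
    """Miss and false-alarm rates when accepting every score >= threshold."""
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return p_miss, p_fa


def min_detection_cost(target_scores, impostor_scores):
    """Smallest detection cost over all thresholds that change a decision."""
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    # One extra threshold above every score covers the "reject all" case.
    thresholds.append(thresholds[-1] + 1.0)
    return min(
        detection_cost(*error_rates(target_scores, impostor_scores, t))
        for t in thresholds
    )


# Hypothetical toy scores, for illustration only (not from the paper).
targets = [2.0, 1.5, 0.5, 3.0]
impostors = [0.0, 1.0, -1.0, 0.2]
print(min_detection_cost(targets, impostors))
```

Under these assumed weights a miss is penalized far less often than a false alarm because true targets are taken to be rare, so the cost-minimizing threshold tends to sit high on the score axis.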