Supervised and Unsupervised Speaker Adaptation in the NIST 2005 Speaker Recognition Evaluation

2006 
Since 2004, the annual NIST Speaker Recognition Evaluation (SRE) has included an optional unsupervised speaker adaptation track in which test files are processed sequentially and the target model may be updated. In this paper, various model adaptation factors are investigated for MAP adaptation using a supervised (ideal) adaptation scheme. Once the best-performing model adaptation factor is found, unsupervised adaptation experiments are run using a threshold to determine when to update the target model. Experiments on the NIST 2005 SRE use three NIST training conditions, 10sec4w, 1conv4w, and 8conv4w, each paired with the 1conv4w test condition. Compared to the baseline, supervised adaptation reduces minDCF by 60.9% for 10sec4w, 48.3% for 1conv4w, and 33.3% for 8conv4w. Unsupervised adaptation reduces minDCF by 16.7%, 21.6%, and 20.5% for the respective training conditions.
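The threshold-gated update described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the model adaptation factor, so the sketch assumes it plays the role of the relevance factor in the standard mean-only MAP update for a diagonal-covariance GMM. The function names, the default `relevance=16.0`, and the way the trial score is passed in are all hypothetical choices made for illustration.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Responsibility of each mixture component for frame x."""
    logs = [
        math.log(w) + log_gauss(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP update (assumed form, not taken from the paper).

    new_mean = (sum_t post_t * x_t + relevance * old_mean) / (n + relevance),
    where n is the soft frame count of the component.
    """
    k, d = len(means), len(means[0])
    counts = [0.0] * k
    sums = [[0.0] * d for _ in range(k)]
    for x in frames:
        post = posteriors(x, weights, means, variances)
        for c in range(k):
            counts[c] += post[c]
            for j in range(d):
                sums[c][j] += post[c] * x[j]
    return [
        [
            (sums[c][j] + relevance * means[c][j]) / (counts[c] + relevance)
            for j in range(d)
        ]
        for c in range(k)
    ]


def unsupervised_update(frames, score, threshold, weights, means, variances,
                        relevance=16.0):
    """Adapt the target model only when the trial score reaches the threshold."""
    if score < threshold:
        return means  # trial rejected: keep the current target model
    return map_adapt_means(frames, weights, means, variances, relevance)
```

In this sketch a larger relevance value keeps the adapted means closer to the existing model, while a smaller one lets the new test data dominate. In the supervised (ideal) scheme the gate would be replaced by the true trial label, so only genuine target trials trigger an update.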