Speaker verification based on speaker background model virtually synthesized using local acoustic information

2002 
This paper proposes a likelihood normalization method for HMM-based speaker verification. In the proposed method, the speaker background model used for likelihood normalization is synthesized from the HMMs of neighboring speakers on the basis of local acoustic information expressed by the phonemes, states, and distributions of the HMMs. This minimizes the statistical intermodel distance between the speaker and the background speaker in the acoustic space and yields effective normalization that absorbs variations in the likelihood. In speaker verification tests using the telephone speech of 640 people (open tests with no time difference, using 320 speakers and 320 impostors), likelihood normalization with a cohort speaker model generated at the distribution level (the proposed method) was compared with the conventional method, in which the cohort speaker model is selected at the speaker level; the proposed method reduced the equal error rate (EER) from 5.27% to 1.76%. In addition, in tests using the telephone speech of 100 people (open tests with a time difference of 3 months, using 24 speakers and 75 impostors), likelihood normalization combining a speaker-independent model with a cohort speaker model generated at the distribution level was compared with a method using only a speaker-independent model; the EER was reduced from 3.41% to 2.51%, confirming the efficacy of the proposed method. © 2002 Wiley Periodicals, Inc. Electron Comm Jpn Pt 2, 85(4): 47–57, 2002; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ecjb.10045
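As an illustration of the general idea of likelihood normalization and of the EER metric reported above, the following is a minimal Python sketch. It is not the paper's method: it does not synthesize a background model from HMM distributions, and it assumes the log-likelihoods of the claimed speaker's model and of the background (cohort) models have already been computed. The function names `log_mean_exp`, `normalized_score`, and `equal_error_rate` are hypothetical, and averaging the cohort likelihoods is one common choice among several.

```python
import math


def log_mean_exp(values):
    """Log of the arithmetic mean of exp(values), computed stably."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def normalized_score(claimant_loglik, background_logliks):
    """Log-likelihood ratio of the claimed speaker's model against the
    averaged background (cohort) models. Assumption: the background
    likelihood is the mean of the cohort likelihoods."""
    return claimant_loglik - log_mean_exp(background_logliks)


def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate EER by sweeping the threshold over observed scores and
    taking the point where false acceptance and false rejection are closest."""
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        # False rejection: genuine trials scoring below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance: impostor trials scoring at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy values, invented purely for illustration.
score = normalized_score(-10.0, [-12.0, -12.0])
eer = equal_error_rate([2.0, 1.5, 0.2], [0.5, -1.0, -2.0])
```

A claim is accepted when the normalized score exceeds a threshold. The closer the background model lies to the claimed speaker in the acoustic space, the more of the utterance-dependent variation in likelihood cancels in the ratio, which is the motivation the abstract gives for building the background model from neighboring speakers.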