Client-wise cohort set selection by combining speaker- and phoneme-specific I-vectors for speaker verification

2018 
This work explores the use of phoneme-level information in cohort selection to improve the performance of a speaker verification system. In speaker verification, a cohort is used in score normalization, a technique that reduces the undesirable score variation arising from acoustically mismatched conditions. Proper selection of the cohort significantly improves speaker verification performance. In this paper, we investigate cohort selection based on a speaker model cluster under the i-vector framework, which we call the i-vector model cluster (IMC). Two approaches for cohort selection are proposed. The first, speaker-specific cohort selection (SSCS), uses speaker-level information to select the cohort. The second, phoneme-specific cohort selection (PSCS), improves cohort set selection by using phoneme-level information. Phoneme-level information is further employed in a late fusion approach that applies majority voting to the normalized scores to improve the performance of the speaker verification system. Speaker verification experiments were conducted using the TIMIT, HINDI and YOHO databases. Compared to the standard ZT-Norm method, the proposed method improves the equal error rate by 19.01%, 14.61% and 19.4% on the TIMIT, HINDI and YOHO datasets, respectively. Reasonable improvements are also obtained in terms of the minimum decision cost function (min DCF) and detection error trade-off (DET) curves.
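To make the idea of client-wise cohort selection and cohort-based score normalization concrete, the following is a minimal sketch and not the paper's implementation. It assumes cosine similarity between i-vectors as both the selection criterion and the verification score, a single T-norm-style normalization step rather than full ZT-Norm, and a simple majority vote over per-stream normalized scores. All function names and the threshold are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_cohort(client_ivec, candidate_ivecs, top_n):
    """Pick the top_n candidate i-vectors closest to the client model.

    Assumption: closeness is measured by cosine similarity; the paper's
    actual selection criterion is not specified in the abstract.
    """
    ranked = sorted(
        candidate_ivecs,
        key=lambda c: cosine(client_ivec, c),
        reverse=True,
    )
    return ranked[:top_n]


def normalize_score(raw_score, cohort_scores):
    """Shift and scale a raw score by the cohort score mean and std."""
    mean = sum(cohort_scores) / len(cohort_scores)
    var = sum((s - mean) ** 2 for s in cohort_scores) / len(cohort_scores)
    std = math.sqrt(var)
    # Guard against a degenerate cohort whose scores are all identical.
    return (raw_score - mean) / std if std > 0 else raw_score - mean


def majority_vote(normalized_scores, threshold):
    """Accept the claim if more than half of the scores exceed threshold."""
    accepts = sum(1 for s in normalized_scores if s > threshold)
    return accepts * 2 > len(normalized_scores)
```

In this sketch, a trial would be scored by computing `cosine(client_ivec, test_ivec)`, scoring the test i-vector against each member of the selected cohort, and passing both to `normalize_score`. Repeating this per phoneme-specific stream and feeding the results to `majority_vote` mirrors the late-fusion idea only loosely.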