Fused Mel Feature sets based Text-Independent Speaker Identification using Gaussian Mixture Model

2012 
Abstract This paper presents an efficient approach to text-independent speaker identification using fused Mel feature sets and Gaussian Mixture Modeling (GMM). The individual Gaussian components of a GMM are shown to represent speaker-specific spectral shapes that are effective for modeling speaker identity. Two complementary feature sets, Mel Frequency Cepstral Coefficients (MFCC) and Inverted Mel Frequency Cepstral Coefficients (IMFCC), are extracted for each speaker and used to train models with the Expectation Maximization algorithm. Two GMMs are created: one for the MFCC feature set and one for the IMFCC feature set. During the testing phase, the likelihood of the unknown speaker's features under each of the GMMs is determined. A weighted sum of these likelihood values gives a fused score for each speaker, and the speaker with the maximum score is the identified speaker. The performance of this fusion GMM system is evaluated on a part of the TIMIT database consisting of 120 speakers, and a maximum identification efficiency of 93.88% is achieved.
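The scoring and fusion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes diagonal-covariance GMMs, average per-frame log-likelihoods as the scores, and a single fusion weight `alpha`; the abstract specifies none of these. The function names `diag_gmm_loglik` and `identify` are hypothetical.

```python
import math


def diag_gmm_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood of `frames` under a GMM.

    Assumption: diagonal covariances. `weights` holds one mixing weight
    per component; `means` and `variances` hold one vector per component.
    """
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p -= 0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(log_p)
        # Log-sum-exp over components for numerical stability.
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)


def identify(mfcc_scores, imfcc_scores, alpha=0.5):
    """Fuse per-speaker scores by a weighted sum and pick the maximum.

    `alpha` is an assumed fusion weight: alpha for the MFCC-based score,
    (1 - alpha) for the IMFCC-based score.
    """
    fused = {
        spk: alpha * mfcc_scores[spk] + (1 - alpha) * imfcc_scores[spk]
        for spk in mfcc_scores
    }
    best = max(fused, key=fused.get)
    return best, fused


# Illustrative scores only, not results from the paper.
mfcc = {"spk_a": -10.0, "spk_b": -12.0}
imfcc = {"spk_a": -15.0, "spk_b": -11.0}
winner, fused = identify(mfcc, imfcc, alpha=0.5)
```

In this toy example the MFCC scores alone favour `spk_a`, while the fused score favours `spk_b`, which shows how a complementary feature stream can change the decision. In practice each score would come from `diag_gmm_loglik` applied to the test utterance's MFCC or IMFCC frames with that speaker's trained model parameters.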