Exploiting prosodic information for Speaker Recognition

Yanhua Long, Bin Ma, Haizhou Li, Wu Guo, Eng Siong Chng, Li Rong Dai

2009

In this paper, we study speaker characterization using prosodic supervectors with negative within-class covariance normalization (NWCCN) projection and speaker modeling with support vector regression (SVR). We also propose a segmental weight fusion (SWF) technique that combines acoustic and prosodic subsystems effectively despite the large performance gap between them. We validate the proposed techniques on the NIST 2006 Speaker Recognition Evaluation (SRE) in comparison with other prominent solutions. The experiments show competitive results: an Equal Error Rate of 17.72% for the prosodic subsystem alone and 4.50% for the fused system on the NIST 2006 SRE core test condition.
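
For readers unfamiliar with the metric quoted in the abstract, the Equal Error Rate is the operating point at which the false-acceptance rate on impostor trials equals the false-rejection rate on target trials. The following is a minimal sketch of how it can be approximated from two lists of trial scores. It is not the authors' or NIST's scoring tool; the function name `equal_error_rate` and the toy scores are hypothetical and chosen only for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold.
    Illustrative helper only; not taken from the paper.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one value above all
    # of them so that "reject everything" is also considered.
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap, best_eer = None, None
    for t in thresholds:
        # False rejection: a target trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a non-target trial scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer


# Toy scores (made up): one target falls below the highest impostor score.
toy_targets = [2.0, 1.5, 0.9, 0.2]
toy_nontargets = [1.0, 0.1, -0.3, -1.2]
toy_eer = equal_error_rate(toy_targets, toy_nontargets)  # 0.25
```

With a finite trial list the two error rates rarely cross exactly, so this sketch reports the average of the two rates at the threshold where they are closest. Evaluation toolkits typically interpolate along the error curve instead.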

Keywords:

NIST
Speaker diarisation
Speech recognition
Support vector machine
Normalization (statistics)
Speaker recognition
Feature extraction
Artificial intelligence
Word error rate
Computer science
Pattern recognition
Covariance
Covariance matrix
