Enhancing Speaker Discrimination at the Feature Level

2007 
This chapter describes a method for enhancing the differences between speaker classes at the feature level (feature enhancement) in an automatic speaker recognition system. The original Mel-frequency cepstral coefficient (MFCC) space is projected onto a new feature space by a neural network trained on a subset of speakers which is representative for the whole target population. The new feature space better discriminates between the target classes (speakers) than the original feature space. The chapter focuses on the method for selecting a representative subset of speakers, comparing several approaches to speaker selection. The effect of feature enhancement is tested both for clean and various noisy speech types to evaluate its applicability under practical conditions. It is shown that the proposed method leads to a substantial improvement in speaker recognition performance. The method can also be applied to other automatic speaker classification tasks.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    22
    References
    3
    Citations
    NaN
    KQI
    []