A comparative analysis of classifiers in emotion recognition through acoustic features

The most popular features used in speech emotion recognition are prosodic and spectral features. However, system performance degrades substantially when these acoustic features are employed individually, i.e., either prosodic or spectral features alone. In this paper, a feature fusion method is proposed that combines energy and pitch prosodic features with MFCC spectral features. The fused features are classified separately using linear discriminant analysis (LDA), regularized discriminant analysis (RDA), support vector machine (SVM), and k-nearest neighbour (kNN) classifiers. The results are validated on the Berlin and Spanish emotional speech databases. With fused features, the performance of each classifier improves by approximately 20% compared with its performance using individual features. The results also reveal that RDA is a better choice as a classifier for emotion classification, because LDA suffers from the singularity problem, which arises when the data are high-dimensional and the sample size is small, i.e., the number of available training speech samples is small compared to the dimensionality of the sample space. RDA eliminates this singularity problem by using a regularization criterion and gives better results.
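The singularity problem described above can be illustrated with a minimal sketch in plain Python. It is not the paper's implementation: the dimensionality (12), the sample count (5), and the random data are hypothetical stand-ins for fused feature vectors, and the shrinkage toward a scaled identity matrix is only one common form of regularization (the identity-shrinkage term in Friedman's RDA), assumed here for illustration. The helper names `covariance`, `matrix_rank`, and `shrink` are likewise invented for this example.

```python
import random

def covariance(samples):
    """Sample covariance matrix (d x d) of a list of d-dimensional vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(x[j] for x in samples) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for x in samples:
        c = [x[j] - mean[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += c[i] * c[j] / (n - 1)
    return cov

def matrix_rank(matrix, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    rows, cols = len(m), len(m[0])
    r = 0
    for col in range(cols):
        if r == rows:
            break
        pivot = max(range(r, rows), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            f = m[i][col] / m[r][col]
            for j in range(col, cols):
                m[i][j] -= f * m[r][j]
        r += 1
    return r

def shrink(cov, gamma):
    """Assumed regularizer: (1 - gamma) * cov + gamma * (trace(cov) / d) * I."""
    d = len(cov)
    scale = sum(cov[i][i] for i in range(d)) / d
    return [[(1 - gamma) * cov[i][j] + (gamma * scale if i == j else 0.0)
             for j in range(d)] for i in range(d)]

random.seed(0)
DIM = 12        # hypothetical fused feature dimensionality
N_SAMPLES = 5   # hypothetical number of training samples (fewer than DIM)
data = [[random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N_SAMPLES)]

cov = covariance(data)
reg = shrink(cov, gamma=0.1)

# With fewer samples than dimensions the covariance cannot have full rank,
# so it has no inverse; the shrunk version is positive definite.
print("rank of sample covariance:", matrix_rank(cov))
print("rank after shrinkage:     ", matrix_rank(reg))
```

Because the covariance of n samples has rank at most n - 1, LDA cannot invert it when n is smaller than the feature dimensionality. Adding a positive multiple of the identity raises every eigenvalue above zero, which is why a regularized estimate stays invertible. The value `gamma=0.1` is arbitrary; in practice such a parameter is usually chosen by cross-validation.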