Analysis of Emotion Recognition from Cross-lingual Speech: Arabic, English, and Urdu

2021 
In systems that involve interaction between machines and humans, the recognition of emotion from audio has long been a focus of research. Emotion recognition can play an essential role in many fields, such as medicine, law, psychology, and customer service. In this paper, we present an empirical comparative analysis of several machine learning classifiers for emotion recognition in audio data. Evaluations are performed on a set of predefined emotions, such as happy, sad, and angry, in Arabic, English, and Urdu. Pitch and cepstral features are extracted from the audio files, and principal component analysis is applied for dimensionality reduction. Experiments show that random forest outperformed the other classifiers on the Urdu dataset, with an accuracy of 78.75%. On the Arabic dataset, however, the Meta iterative classifier performed better than random forest and the neural network, with an accuracy of 70%. Classification on the English dataset, whose emotion classes do not differ much in terms of pitch and MFCC features, produced the lowest accuracies, at or below 31%.
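The dimensionality-reduction step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature matrix is synthetic random data standing in for per-utterance pitch and MFCC features, and the sizes (120 utterances, 14 features, 5 retained components) and the helper name `pca_reduce` are assumptions chosen only for illustration.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project the rows of `features` onto the top principal components."""
    # Center each feature column so PCA captures variance, not the mean.
    centered = features - features.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions,
    # ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Fraction of total variance carried by each retained component.
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ components.T, explained[:n_components]

rng = np.random.default_rng(0)
# Hypothetical stand-in: 120 utterances x 14 features
# (for example, 1 pitch statistic plus 13 MFCC means).
features = rng.normal(size=(120, 14))

reduced, explained = pca_reduce(features, n_components=5)
print(reduced.shape)      # (120, 5)
print(explained.sum())    # share of variance kept by the 5 components
```

The reduced matrix would then be passed to a classifier such as the random forest named in the abstract; that step is omitted here because the abstract gives no classifier settings.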