Emotion recognition from speech using convolutional neural network with recurrent neural network architecture

2017 
Recognizing emotion is a difficult problem, particularly when it must be done from the speech signal alone. A substantial body of research addresses emotion recognition from speech. The primary challenges are choosing an emotion recognition corpus (speech database), identifying speech features that carry emotional information, and choosing an appropriate classification model. In this article we use 13 MFCCs (Mel Frequency Cepstral Coefficients) together with their 13 velocity (delta) and 13 acceleration (delta-delta) components as features, giving 39 features per frame, and a CNN (Convolutional Neural Network) and LSTM (Long Short-Term Memory) based approach for classification. We chose the Berlin Emotional Speech dataset (EmoDB) for classification and obtain approximately 80 percent accuracy on test data.
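As an illustration of the feature layout described above, the following minimal sketch stacks 13 MFCCs with their velocity and acceleration components into a 39-dimensional vector per frame. It is not the authors' implementation: the abstract does not say how the velocity and acceleration components are computed, so the regression-style delta formula, the window half-width `N = 2`, the edge padding, and the function names `delta` and `stack_features` are all assumptions made for this example. The MFCC matrix itself is assumed to come from an external feature extractor (for instance `librosa.feature.mfcc`), which is not shown.

```python
import numpy as np


def delta(features: np.ndarray, N: int = 2) -> np.ndarray:
    """Regression-style delta over time (assumed formula, not from the paper).

    features has shape (num_coeffs, num_frames). For each frame t:
        d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)
    Frames past the edges are filled by repeating the first/last frame.
    """
    num_frames = features.shape[1]
    # Pad only along the time axis so that t-n and t+n always exist.
    padded = np.pad(features, ((0, 0), (N, N)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, N + 1))
    out = np.zeros(features.shape, dtype=float)
    for n in range(1, N + 1):
        ahead = padded[:, N + n : N + n + num_frames]    # c_{t+n}
        behind = padded[:, N - n : N - n + num_frames]   # c_{t-n}
        out += n * (ahead - behind)
    return out / denom


def stack_features(mfcc: np.ndarray) -> np.ndarray:
    """Stack MFCCs, velocity, and acceleration along the coefficient axis."""
    velocity = delta(mfcc)          # first-order time derivative estimate
    acceleration = delta(velocity)  # second-order time derivative estimate
    return np.vstack([mfcc, velocity, acceleration])


# Hypothetical input: 13 coefficients over 20 frames, each rising linearly.
mfcc = np.tile(np.arange(20, dtype=float), (13, 1))
features = stack_features(mfcc)
print(features.shape)
```

With a 13 x T MFCC matrix as input, the result is a 39 x T matrix. A sequence of such frames is the kind of input that a CNN followed by an LSTM could consume, though the abstract does not specify the network's layers or input shape.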