End-to-End Speech Emotion Recognition Based on One-Dimensional Convolutional Neural Network

2019 
Real-time speech emotion recognition remains a challenging problem. To address it, we propose an end-to-end speech emotion recognition model based on a one-dimensional convolutional neural network that contains only three convolution layers, two pooling layers, and one fully connected layer. Trained with the Adam optimization algorithm and backpropagation, the model progressively learns more discriminative features. The model is simple in structure and completes the emotion classification task quickly. Unlike traditional methods, it requires no complex manual feature extraction: the model learns emotional features automatically from raw speech signals. In emotion recognition experiments on four speech databases (EMODB, CASIA, IEMOCAP, and CHEAVD), relatively high recognition rates were obtained. The experiments show that the proposed algorithm is of great benefit to the implementation of real-time speech emotion recognition.
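As a rough illustration of the layer stack described above, the following is a minimal pure-Python sketch of a forward pass through three convolution layers, two pooling layers, and one fully connected layer applied to a raw waveform. The abstract gives only the layer counts, so everything else here is an assumption: the channel counts, kernel sizes, strides, pooling size, placement of the pooling layers after the first two convolutions, ReLU activations, softmax output, and random initialization are all hypothetical. Training with Adam and backpropagation is not shown.

```python
import math
import random

def conv1d_relu(x, weights, bias, stride):
    """Valid 1-D convolution + ReLU. x: [channels][time]; weights: [out][in][k]."""
    k = len(weights[0][0])
    out_len = (len(x[0]) - k) // stride + 1
    out = []
    for w_out, b in zip(weights, bias):
        row = []
        for t in range(out_len):
            start = t * stride
            s = b
            for channel, w_in in zip(x, w_out):
                for j in range(k):
                    s += channel[start + j] * w_in[j]
            row.append(max(0.0, s))
        out.append(row)
    return out

def max_pool1d(x, size):
    """Non-overlapping max pooling along the time axis."""
    return [[max(ch[i:i + size]) for i in range(0, len(ch) - size + 1, size)]
            for ch in x]

def dense_softmax(features, weights, bias):
    """Fully connected layer followed by a numerically stable softmax."""
    logits = [sum(f * w for f, w in zip(features, row)) + b
              for row, b in zip(weights, bias)]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical hyperparameters: (out_channels, kernel_size, stride) per conv layer.
CONV_SPECS = [(4, 9, 4), (8, 5, 2), (8, 3, 1)]
POOL_SIZE = 2          # assumed pooling window
POOLED_LAYERS = 2      # assumed: pooling follows the first two conv layers

def build_params(input_len, n_classes, seed=0):
    """Randomly initialise weights and size the dense layer to the conv output."""
    rng = random.Random(seed)
    convs, in_ch, length = [], 1, input_len
    for i, (out_ch, k, stride) in enumerate(CONV_SPECS):
        w = [[[rng.uniform(-0.1, 0.1) for _ in range(k)]
              for _ in range(in_ch)] for _ in range(out_ch)]
        convs.append((w, [0.0] * out_ch, stride))
        length = (length - k) // stride + 1
        if i < POOLED_LAYERS:
            length //= POOL_SIZE
        in_ch = out_ch
    n_features = in_ch * length
    fc_w = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
            for _ in range(n_classes)]
    return {"convs": convs, "fc": (fc_w, [0.0] * n_classes)}

def forward(signal, params):
    """Map a raw waveform (list of floats) to class probabilities."""
    x = [list(signal)]  # one input channel
    for i, (w, b, stride) in enumerate(params["convs"]):
        x = conv1d_relu(x, w, b, stride)
        if i < POOLED_LAYERS:
            x = max_pool1d(x, POOL_SIZE)
    flat = [v for ch in x for v in ch]
    fc_w, fc_b = params["fc"]
    return dense_softmax(flat, fc_w, fc_b)

if __name__ == "__main__":
    # Synthetic 400-sample sine wave standing in for a speech frame.
    wave = [math.sin(2 * math.pi * 5 * t / 400) for t in range(400)]
    probs = forward(wave, build_params(len(wave), n_classes=7))
    print(len(probs), round(sum(probs), 6))
```

The number of classes (7 here) is a placeholder. In practice it depends on the emotion labels of the database used, and a real implementation would rely on a deep learning framework rather than nested Python loops.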