Deep neural network and switching Kalman filter based continuous affect recognition
2016
In this paper, we propose deep neural network and switching Kalman filter (DNN-SKF) frameworks for both single-modal and multi-modal continuous affective dimension estimation. The DNN-SKF framework first models the complex nonlinear relationship between the input (audio, visual, or lexical) features and the affective dimensions with a non-recurrent DNN, then models the temporal dynamics embedded in the emotions with a segmental linear SKF. Affective dimension estimation experiments are carried out on the Audio Visual Emotion Challenge (AVEC2012) database. Single-modal estimation results are compared with those from Support Vector Regression (SVR) models and Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) models. For all modalities, and for all affective dimensions except arousal estimated from the audio features, the DNN-SKFs outperform the SVR and BLSTM-RNN models. Multi-modal estimation results are compared with the state-of-the-art results from the AVEC2012 competition. On both the development set and the test set, the proposed DNN-SKF models obtain the best performance in estimating the affective dimensions. On the test set, with the audio-visual features, the average Pearson correlation coefficient (COR) improves to 0.326 from the 0.226 of the linear regression method [1], while with the audio-visual and lexical features, the COR improves to 0.355 from the 0.344 of the particle filter fusion method (SVR-PF) [2].
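The two-stage idea in the abstract (frame-level regression followed by a linear dynamical model, scored with the Pearson correlation coefficient) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a single-regime scalar Kalman filter with a random-walk state rather than a switching Kalman filter, and synthetic noisy values stand in for DNN outputs. The function names `kalman_filter_1d` and `pearson_cor`, the variances, and the toy signal are all assumptions made for illustration.

```python
import numpy as np


def kalman_filter_1d(observations, process_var=1e-3, obs_var=1e-1):
    """Forward Kalman filter for a scalar random-walk state (illustrative only).

    State model:       x_t = x_{t-1} + w_t,  w_t ~ N(0, process_var)
    Observation model: y_t = x_t + v_t,      v_t ~ N(0, obs_var)
    """
    y = np.asarray(observations, dtype=float)
    estimates = np.empty_like(y)
    x, p = y[0], 1.0  # assumed initial state estimate and variance
    for t, obs in enumerate(y):
        p = p + process_var          # predict: variance grows by process noise
        k = p / (p + obs_var)        # Kalman gain
        x = x + k * (obs - x)        # update the state with the new observation
        p = (1.0 - k) * p            # update the state variance
        estimates[t] = x
    return estimates


def pearson_cor(a, b):
    """Pearson correlation coefficient between two 1-D sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float((a @ b) / np.sqrt((a @ a) * (b @ b)))


# Toy data: a slowly varying "affective dimension" and noisy frame-level
# predictions standing in for the output of a non-recurrent regressor.
rng = np.random.default_rng(0)
time = np.linspace(0.0, 4.0 * np.pi, 400)
truth = np.sin(time)
frame_preds = truth + rng.normal(0.0, 0.5, size=time.shape)

filtered = kalman_filter_1d(frame_preds)
cor_raw = pearson_cor(frame_preds, truth)
cor_filtered = pearson_cor(filtered, truth)
print(f"COR raw: {cor_raw:.3f}, COR filtered: {cor_filtered:.3f}")
```

On this toy signal the filtered trajectory correlates more strongly with the ground truth than the raw frame-level values, which is the intuition behind adding a temporal model after a non-recurrent regressor. A switching Kalman filter would additionally maintain several such linear models and infer which one is active in each segment.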
Keywords:
- Computer science
- Artificial intelligence
- Computer vision
- Kalman filter
- Support vector machine
- Recurrent neural network
- Artificial neural network
- Pattern recognition
- Linear regression
- Machine learning
- Feature extraction
- Hidden Markov model
- Test set
- Pearson product-moment correlation coefficient
- Speech recognition
- Particle filter