Cross-culture Continuous Emotion Recognition with Multimodal Features
2019
Automatic emotion recognition is a challenging task that can have a great impact on improving natural human-computer interaction. In this paper, we present our automatic prediction of dimensional emotional states for the Cross-cultural Emotion Sub-Challenge of AVEC 2018, which uses multiple features and fusion across the visual, audio, and text modalities. Single-feature predictions are first modeled with support vector regression (SVR). Multimodal fusion across these modalities is then performed with a multiple linear regression model. Besides the baseline features, we extract one-gram and two-gram features from text and several types of convolutional neural network (CNN) features from video. Our multimodal fusion reached CCC = 0.599 on the development set for arousal, 0.617 for valence, and 0.289 for likability.
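The two-stage pipeline described above (per-modality SVR predictions fused by multiple linear regression, scored with the concordance correlation coefficient) can be sketched as follows. This is a minimal illustration with synthetic data and hypothetical feature dimensions, not the paper's exact training protocol or hyperparameters:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression

# Synthetic stand-ins for per-frame features from three modalities and a
# continuous gold-standard label (e.g. arousal); shapes are illustrative.
rng = np.random.default_rng(0)
n_train, n_dev = 200, 100
X_audio = rng.normal(size=(n_train + n_dev, 20))
X_video = rng.normal(size=(n_train + n_dev, 30))
X_text = rng.normal(size=(n_train + n_dev, 15))
y = rng.normal(size=n_train + n_dev)

def fit_predict(X):
    """Stage 1: train a single-feature SVR on the training split,
    then predict on the development split."""
    model = SVR(kernel="linear", C=0.1)
    model.fit(X[:n_train], y[:n_train])
    return model.predict(X[n_train:])

# One prediction stream per modality.
preds = np.column_stack([fit_predict(X) for X in (X_audio, X_video, X_text)])

# Stage 2: late fusion with multiple linear regression over the
# modality-level predictions (fitted here on the dev split purely
# for illustration; the paper's fitting protocol may differ).
fusion = LinearRegression().fit(preds, y[n_train:])
fused = fusion.predict(preds)

def ccc(pred, gold):
    """Concordance correlation coefficient, the challenge metric."""
    mp, mg = pred.mean(), gold.mean()
    cov = ((pred - mp) * (gold - mg)).mean()
    return 2 * cov / (pred.var() + gold.var() + (mp - mg) ** 2)

print(f"dev CCC: {ccc(fused, y[n_train:]):.3f}")
```

CCC penalizes both poor correlation and any bias in mean or scale, which is why it is preferred over plain Pearson correlation for continuous emotion tracks.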