Image-text Multimodal Emotion Classification via Multi-view Attentional Network

2020 
Compared with single-modal content, multimodal data can express users' feelings and sentiments more vividly and engagingly. Therefore, multimodal sentiment analysis has become a popular research topic. However, most existing methods either learn modal sentiment features independently, without considering their correlations, or simply integrate multimodal features. In addition, most publicly available multimodal datasets are labeled with sentiment polarities, while the emotions expressed by users are more specific. Based on this observation, in this paper we build a large-scale image-text emotion dataset (i.e., labeled with different emotions), called TumEmo, with more than 190,000 instances from Tumblr. We further propose a novel multimodal emotion analysis model based on the Multi-view Attentional Network (MVAN), which utilizes a continually updated memory network to obtain deep semantic features of image-text pairs. The model consists of three stages: feature mapping, interactive learning, and feature fusion. In the feature mapping stage, we leverage image features from an object viewpoint and a scene viewpoint to capture information effective for multimodal emotion analysis. Then, an interactive learning mechanism based on the memory network extracts single-modal emotion features and interactively models the cross-view dependencies between image and text. In the feature fusion stage, multiple features are deeply fused using a multilayer perceptron and a stacking-pooling module. Experimental results on the MVSA-Single, MVSA-Multiple, and TumEmo datasets show that the proposed MVAN outperforms strong baseline models by large margins.

Index Terms—Multimodal sentiment analysis; Memory network; Multi-view attention mechanism; Social media
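The abstract describes a three-stage pipeline (feature mapping, interactive learning with a memory network, feature fusion). The following is a minimal, hypothetical sketch of such a pipeline; all module names, feature dimensions, and the attention/update rules are illustrative assumptions and not the authors' released implementation.

```python
# Hypothetical sketch of a three-stage MVAN-style pipeline:
# feature mapping -> interactive learning via memory hops -> feature fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MemoryAttention(nn.Module):
    """One memory hop: attend over a context sequence with the current memory as query."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, memory, context):
        # memory: (B, D), context: (B, T, D)
        scores = torch.bmm(context, self.proj(memory).unsqueeze(2)).squeeze(2)  # (B, T)
        attn = F.softmax(scores, dim=1)
        read = torch.bmm(attn.unsqueeze(1), context).squeeze(1)                 # (B, D)
        return memory + read  # continually updated memory


class MVANSketch(nn.Module):
    def __init__(self, dim=256, hops=3, num_emotions=7):
        super().__init__()
        # Stage 1: feature mapping -- project object-view and scene-view image
        # features and text embeddings into a shared space (input sizes assumed).
        self.obj_map = nn.Linear(2048, dim)    # object-view CNN features
        self.scene_map = nn.Linear(2048, dim)  # scene-view CNN features
        self.text_map = nn.Linear(300, dim)    # word embeddings
        # Stage 2: interactive learning -- memory hops modeling cross-view
        # dependencies between the image views and the text.
        self.hops = hops
        self.img_attn = MemoryAttention(dim)
        self.txt_attn = MemoryAttention(dim)
        # Stage 3: feature fusion -- an MLP standing in for the paper's
        # multilayer perceptron + stacking-pooling module.
        self.fusion = nn.Sequential(
            nn.Linear(4 * dim, dim), nn.ReLU(), nn.Linear(dim, num_emotions)
        )

    def forward(self, obj_feat, scene_feat, text_emb):
        # obj_feat/scene_feat: (B, R, 2048) region features; text_emb: (B, T, 300)
        obj = self.obj_map(obj_feat)
        scene = self.scene_map(scene_feat)
        text = self.text_map(text_emb)

        img_mem = (obj.mean(1) + scene.mean(1)) / 2  # initial image memory
        txt_mem = text.mean(1)                       # initial text memory
        for _ in range(self.hops):
            # Cross-view interaction: image memory attends over text, and vice versa.
            img_mem = self.img_attn(img_mem, text)
            txt_mem = self.txt_attn(txt_mem, torch.cat([obj, scene], dim=1))

        fused = torch.cat([obj.mean(1), scene.mean(1), img_mem, txt_mem], dim=1)
        return self.fusion(fused)  # emotion logits


if __name__ == "__main__":
    model = MVANSketch()
    logits = model(torch.randn(2, 36, 2048),
                   torch.randn(2, 36, 2048),
                   torch.randn(2, 20, 300))
    print(logits.shape)  # torch.Size([2, 7])
```

The repeated memory hops are one plausible way to realize the "continually updated" memory described above: each hop reads from the other modality and adds the attended summary back into the query memory before fusion.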