Exploiting objective text description of images for visual sentiment analysis

2020 
This paper addresses the problem of Visual Sentiment Analysis, focusing on estimating the polarity of the sentiment evoked by an image. Starting from an embedding approach that exploits both visual and textual features, we attempt to boost the contribution of each input view. We propose to extract and employ an Objective Text description of images in place of the classic Subjective Text provided by users (i.e., title, tags and image description), which is extensively exploited in the state of the art to infer the sentiment associated with social images. Objective Text is obtained from the visual content of the images through recent deep learning architectures, which are used to classify objects and scenes and to perform image captioning. Objective Text features are then combined with visual features in an embedding space obtained with Canonical Correlation Analysis. The sentiment polarity is then inferred by a supervised Support Vector Machine. In the evaluation, we compared a large number of combinations of textual and visual features, together with baselines drawn from state-of-the-art methods. Experiments performed on a representative dataset of 47235 labelled samples demonstrate that exploiting Objective Text helps to outperform the state of the art in sentiment polarity estimation.