Visual sentiment analysis based on image caption and adjective–noun–pair description

2021 
Visual sentiment analysis, which aims to predict the sentiment orientation of images and videos, mainly focuses on building a mapping between visual content and sentiment labels. Instead of using visual analysis models to bridge the semantic gap directly, a novel method for image sentiment prediction is proposed that translates images into textual descriptions and analyzes visual sentiment indirectly by means of textual sentiment analysis. First, a deep-learning-based image caption framework, consisting of a deep residual network and a long short-term memory (LSTM) network, is used to generate an initial textual description of each image. Because this initial description lacks the vocabulary to express sentiment, adjective–noun-pair (ANP) descriptions are introduced through SentiBank, an image sentiment analysis library containing a large number of ANP concept detectors. The four adjective–noun pairs with the largest response values are appended to the description produced by the image caption model to form the final textual description of the image. The text describing the visual content is then preprocessed by deleting redundant symbols, retaining part of the vocabulary, and embedding words as vectors. Finally, the word vectors are fed into a time-series network to train a sentiment prediction model. Experimental results on the Twitter dataset show that this method improves image sentiment prediction over competing methods, raising accuracy by 2% and precision by 3%.
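The description-building and preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the caption string, ANP response scores, and vocabulary are hypothetical stand-ins for the outputs of the ResNet+LSTM caption model and SentiBank's ANP concept detectors.

```python
import re

def top_k_anps(anp_scores, k=4):
    """Return the k adjective-noun pairs with the largest response values
    (stand-in for SentiBank's ANP detector outputs)."""
    ranked = sorted(anp_scores.items(), key=lambda kv: -kv[1])
    return [anp for anp, _ in ranked[:k]]

def build_description(caption, anp_scores, k=4):
    """Append the top-k ANPs to the caption to form the final description,
    as in the paper's second stage."""
    return caption + " " + " ".join(top_k_anps(anp_scores, k))

def preprocess(text, vocab):
    """Delete redundant symbols and retain only in-vocabulary words,
    yielding tokens ready for word-vector embedding."""
    tokens = re.sub(r"[^a-z\s]", " ", text.lower()).split()
    return [t for t in tokens if t in vocab]

# Hypothetical example inputs.
caption = "a dog running on the beach"
anp_scores = {"happy dog": 0.91, "sunny beach": 0.88,
              "cute puppy": 0.75, "clear sky": 0.64, "dark cloud": 0.12}
vocab = {"dog", "running", "beach", "happy", "sunny",
         "cute", "puppy", "clear", "sky"}

description = build_description(caption, anp_scores)
tokens = preprocess(description, vocab)
```

In the full system, `tokens` would be mapped to word vectors and fed into the time-series (LSTM) sentiment classifier; only the four highest-scoring ANPs are kept, so low-response pairs such as "dark cloud" never enter the description.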