Context-Aware Emotion Recognition Based on Visual Relationship Detection

2021 
Emotion recognition, a branch of affective computing, has attracted broad research interest because of its wide range of applications. Whereas earlier approaches recognize a person's emotional state from facial expression, speech, or gesture alone, recent work has shown the value of contextual information from the surrounding scene. Accordingly, in addition to features of the main subject, general background information serves as a complementary cue for emotion prediction. However, most existing works explore scene-level context only shallowly. In this paper, to exploit context fully, we propose an emotional-state prediction method based on visual relationship detection between the main subject and adjacent objects in the background. Specifically, we use both the spatial and semantic features of objects in the scene, together with a modified attention mechanism, to estimate the influence of each context-related element on the main subject and the polarity of that influence (positive, negative, or neutral). The model then combines these features with scene-context and body features of the target person to predict their emotional state. Our method achieves state-of-the-art performance on the CAER-S dataset and competitive results on the EMOTIC benchmark.
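The fusion step described above can be sketched in a few lines. This is a minimal numpy illustration, not the authors' implementation: the function name `context_attention_fusion` and the plain dot-product attention are assumptions, and the paper's impact-polarity prediction (positive, negative, neutral) is omitted for brevity.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def context_attention_fusion(subject_feat, object_feats, scene_feat):
    """Hypothetical sketch of attention-based context fusion.

    subject_feat: (d,)   body feature of the main subject
    object_feats: (n, d) spatial + semantic features of nearby objects
    scene_feat:   (d,)   global scene-context feature
    Returns the fused feature vector and the per-object attention weights.
    """
    # Attention scores: dot-product relevance of each object to the subject
    scores = object_feats @ subject_feat        # (n,)
    weights = softmax(scores)                   # influence of each object
    context_feat = weights @ object_feats       # (d,) weighted context summary
    # Fuse body, context, and scene features for the emotion classifier
    fused = np.concatenate([subject_feat, context_feat, scene_feat])
    return fused, weights
```

In the paper the attention is "modified" to also classify each object's impact as positive, negative, or neutral; a real implementation would replace the raw dot product with learned projections and feed `fused` into a classification head.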