Contextually Aware Multimodal Emotion Recognition

2021 
Detecting emotion in conversation has many applications, such as humanizing chatbots, gauging public opinion on social media, medical counseling, building security systems, and interactive computer simulations. Since humans express emotion not only through what they say but also through their tone and facial expressions, we use features from three modalities (text, audio, and video) and experiment with different fusion techniques to combine the models. We propose a new architecture designed specifically for dyadic conversation, in which each participant is modelled by a separate network; the two networks exchange emotion context and effectively hold a conversation with each other. We further refine this model using teacher forcing.
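The dyadic architecture described above can be sketched roughly as follows. This is an illustrative assumption, not the paper's exact design: the feature dimensions, the plain tanh-RNN cell, and concatenation-based early fusion of the three modalities are all stand-ins, and the "emotion context" exchange is modelled simply by feeding each speaker's network the other speaker's latest hidden state.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_cell(x, h, W, U, b):
    # One tanh-RNN step: new hidden state from input x and previous h.
    return np.tanh(W @ x + U @ h + b)

# Hypothetical per-modality feature sizes and hidden size.
D_TEXT, D_AUDIO, D_VIDEO, H = 8, 4, 6, 5
D_IN = D_TEXT + D_AUDIO + D_VIDEO + H  # fused features + partner context

def make_params():
    # Random weights for one speaker's network (toy initialization).
    return (rng.normal(scale=0.1, size=(H, D_IN)),
            rng.normal(scale=0.1, size=(H, H)),
            np.zeros(H))

params = {"A": make_params(), "B": make_params()}
hidden = {"A": np.zeros(H), "B": np.zeros(H)}

def step(speaker, text, audio, video):
    """Process one utterance: fuse the three modality feature vectors,
    append the partner's hidden state as emotion context, and update
    this speaker's network."""
    other = "B" if speaker == "A" else "A"
    fused = np.concatenate([text, audio, video, hidden[other]])
    W, U, b = params[speaker]
    hidden[speaker] = rnn_cell(fused, hidden[speaker], W, U, b)
    return hidden[speaker]

# Toy dialogue: speakers alternate, each turn conditioned on the
# other network's latest emotion context.
for speaker in ["A", "B", "A", "B"]:
    feats = (rng.normal(size=D_TEXT),
             rng.normal(size=D_AUDIO),
             rng.normal(size=D_VIDEO))
    h = step(speaker, *feats)
```

In a trained version of such a model, each hidden state would feed an emotion classifier per utterance, and teacher forcing would mean conditioning on ground-truth emotion labels of past utterances during training rather than on the model's own predictions.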