Fusing Verbal and Nonverbal Information for Extractive Meeting Summarization

2018 
Automatic meeting summarization would reduce the cost of producing minutes during or after a meeting. With the goal of establishing a method for extractive meeting summarization, we propose a multimodal fusion model that identifies the important utterances in group discussions that should be included in meeting extracts. The proposed model fuses audio, visual, motion, and linguistic unimodal models, each trained with a convolutional neural network approach. The fusion model combining verbal and nonverbal information achieved an F-measure of 0.831. We also discuss the characteristics of the verbal and nonverbal models and demonstrate that they complement each other.