Joint Image-Text News Topic Detection and Tracking by Multimodal Topic And-Or Graph

2017 
This paper presents a novel method for automatically detecting and tracking news topics in multimodal TV news data. We propose a multimodal topic and-or graph (MT-AOG) to jointly represent the textual and visual elements of news stories and their latent topic structures. An MT-AOG leverages a context-sensitive grammar that describes the hierarchical composition of news topics from semantic elements (the people involved, the related places, and what happened) and models the contextual relationships between elements in the hierarchy. We detect news topics through a cluster sampling process that groups stories about closely related events together. Swendsen–Wang cuts, an effective cluster sampling algorithm, is adopted for traversing the solution space and obtaining optimal clustering solutions by maximizing a Bayesian posterior probability. The detected topics are then continuously tracked and updated as new stories arrive in the news stream. We generate topic trajectories to show how topics emerge, evolve, and disappear over time. The experimental results show that our method can explicitly describe the textual and visual data in news videos and produce meaningful topic trajectories. Our method also outperforms previous methods on the document clustering task on the Reuters-21578 dataset and on our novel UCLA Broadcast News dataset.
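To make the cluster sampling step concrete, the following is a minimal sketch of a Swendsen–Wang-cuts-style sampler on a toy story graph. It is not the paper's implementation: the pairwise similarity table `sim`, the toy `log_posterior`, the uniform label proposal, and all function names are assumptions made for illustration, and the MT-AOG representation itself is not modelled.

```python
import math
import random
from collections import defaultdict


def log_posterior(labels, sim):
    """Toy log-posterior (assumption): sum of log-odds over same-cluster pairs."""
    return sum(math.log(q) - math.log(1.0 - q)
               for (u, v), q in sim.items() if labels[u] == labels[v])


def component_of(seed, on_edges):
    """Nodes reachable from `seed` through the edges that were turned on."""
    adj = defaultdict(set)
    for u, v in on_edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {seed}, [seed]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def log_cut(comp, label, labels, sim):
    """Log of the product of (1 - q_e) over edges between `comp` and
    nodes outside `comp` that currently carry `label`."""
    total = 0.0
    for (u, v), q in sim.items():
        if (u in comp) != (v in comp):
            outside = v if u in comp else u
            if labels[outside] == label:
                total += math.log(1.0 - q)
    return total


def swc_step(labels, sim, n_labels, rng):
    """One move: sample a connected component and try to relabel it."""
    # Turn on each edge inside a cluster with probability q_e.
    on = [e for e, q in sim.items()
          if labels[e[0]] == labels[e[1]] and rng.random() < q]
    seed = rng.choice(sorted(labels))
    comp = component_of(seed, on)
    old, new = labels[seed], rng.randrange(n_labels)  # uniform label proposal
    if new == old:
        return labels
    proposal = dict(labels)
    for node in comp:
        proposal[node] = new
    # Metropolis-Hastings ratio: cut-edge ratio times posterior ratio.
    log_alpha = (log_cut(comp, new, labels, sim) - log_cut(comp, old, labels, sim)
                 + log_posterior(proposal, sim) - log_posterior(labels, sim))
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return proposal
    return labels


def cluster_stories(nodes, sim, n_labels=3, steps=500, seed=0):
    """Run the sampler and keep the highest-scoring labelling visited."""
    rng = random.Random(seed)
    labels = {n: 0 for n in nodes}
    best = labels
    for _ in range(steps):
        labels = swc_step(labels, sim, n_labels, rng)
        if log_posterior(labels, sim) > log_posterior(best, sim):
            best = labels
    return best


# Toy data (assumption): two groups of three stories each.
stories = ["a1", "a2", "a3", "b1", "b2", "b3"]
sim = {(u, v): (0.9 if u[0] == v[0] else 0.1)
       for i, u in enumerate(stories) for v in stories[i + 1:]}
result = cluster_stories(stories, sim)
```

Because a whole connected component is relabelled in one move, the sampler can shift several related stories between topics at once, which is the property that makes this family of algorithms attractive for exploring clustering solutions.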