Performance evaluation of a contextual news story segmentation algorithm

2006 
The problem of semantic video structuring is vital for automated management of large video collections. The goal is to automatically extract from the raw data the inner structure of a video collection; so that a whole new range of applications to browse and search video collections can be derived out of this high-level segmentation. To reach this goal, we exploit techniques that consider the full spectrum of video content; it is fundamental to properly integrate technologies from the fields of computer vision, audio analysis, natural language processing and machine learning. In this paper, a multimodal feature vector providing a rich description of the audio, visual and text modalities is first constructed. Boosted Random Fields are then used to learn two types of relationships: between features and labels and between labels associated with various modalities for improved consistency of the results. The parameters of this enhanced model are found iteratively by using two successive stages of Boosting. We experimented using the TRECvid corpus and show results that validate the approach over existing studies.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    0
    References
    0
    Citations
    NaN
    KQI
    []