Chapter 4 – Video Structure Discovery Using Unsupervised Learning

This chapter deals with video structure discovery using unsupervised learning. It presents a content-adaptive analysis and representation framework for audio event discovery from unscripted multimedia content. The framework is based on the observation that interesting events occur sparsely against a background of usual events. Three time series are used for audio event discovery: low-level audio features, frame-level audio classification labels, and 1-second-level audio classification labels. Each time series is given an inlier/outlier-based temporal segmentation, derived from eigenvector analysis of the affinity matrix computed from statistical models of subsequences of the input time series. The detected outliers are also ranked by their deviation from the background process. Experimental results on 12 hours of sports audio from three genres (soccer, baseball, and golf, drawn from Japanese, American, and Spanish broadcasts) show that unusual events can be effectively extracted from the inlier/outlier-based segmentation produced by the proposed framework.
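To make the segmentation step concrete, the sketch below illustrates one plausible realization of the inlier/outlier idea on a low-level feature time series: windows of features are summarized by simple statistical models, an affinity matrix is built from pairwise model distances, and an eigenvector analysis of that matrix splits windows into a dominant background cluster and a sparse outlier cluster, which is then ranked by deviation from the background. This is a minimal illustration, not the authors' exact method: the diagonal-Gaussian window models, symmetrized KL distance, Gaussian kernel, window length, hop size, and sign-based spectral bipartition are all assumed choices for demonstration.

```python
import numpy as np

def window_models(features, win_len=100, hop=50):
    """Fit a diagonal-Gaussian model (mean, variance) to each subsequence window.
    `features` is a (T, D) array of low-level audio features (e.g. MFCCs)."""
    models = []
    for start in range(0, len(features) - win_len + 1, hop):
        seg = features[start:start + win_len]
        models.append((seg.mean(axis=0), seg.var(axis=0) + 1e-6))
    return models

def symmetric_kl(m1, m2):
    """Symmetrized KL divergence between two diagonal Gaussians."""
    mu1, v1 = m1
    mu2, v2 = m2
    kl12 = 0.5 * np.sum(np.log(v2 / v1) + (v1 + (mu1 - mu2) ** 2) / v2 - 1.0)
    kl21 = 0.5 * np.sum(np.log(v1 / v2) + (v2 + (mu2 - mu1) ** 2) / v1 - 1.0)
    return kl12 + kl21

def inlier_outlier_segmentation(features, win_len=100, hop=50, sigma=None):
    """Split feature windows into a dominant background (inlier) cluster and a
    sparse unusual-event (outlier) cluster via eigenvector analysis of the
    affinity matrix, then rank outliers by deviation from the background."""
    models = window_models(features, win_len, hop)
    n = len(models)
    # Pairwise distances between window models.
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = symmetric_kl(models[i], models[j])
    if sigma is None:
        sigma = np.median(dist[dist > 0])      # heuristic kernel bandwidth (assumption)
    affinity = np.exp(-dist / sigma)           # affinity matrix over windows
    # Eigenvector analysis: bipartition using the second eigenvector of the
    # normalized Laplacian of the affinity matrix (spectral clustering style).
    deg = affinity.sum(axis=1)
    laplacian = np.diag(deg) - affinity
    norm_lap = laplacian / np.sqrt(np.outer(deg, deg))
    _, eigvecs = np.linalg.eigh(norm_lap)
    fiedler = eigvecs[:, 1]                    # second-smallest eigenvector
    side = fiedler >= 0
    # The larger side is treated as the usual (background) process, since
    # unusual events are assumed to be sparse.
    inlier_side = side if side.sum() >= (~side).sum() else ~side
    outliers = np.where(~inlier_side)[0]
    # Rank outliers by deviation from an aggregate background model.
    bg_idx = np.where(inlier_side)[0]
    bg_model = (np.mean([models[i][0] for i in bg_idx], axis=0),
                np.mean([models[i][1] for i in bg_idx], axis=0))
    scores = np.array([symmetric_kl(models[i], bg_model) for i in outliers])
    order = np.argsort(scores)[::-1]
    return outliers[order], scores[order]
```

Treating the larger of the two spectral clusters as the background reflects the framework's core assumption that unusual events occupy only a small fraction of the timeline; the returned ranking then surfaces the windows that deviate most strongly from that background process.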