Grouplet-Based Distance Metric Learning for Video Concept Detection

2012 
We investigate general concept detection in unconstrained videos. A distance metric learning algorithm is developed that uses the grouplet structure to improve detection. A grouplet is defined as a set of audio and/or visual codewords that are grouped together according to their strong correlations in videos. By using entire grouplets as building elements, concepts can be detected more robustly than with discrete audio or visual codewords. Compared with the traditional method of generating aggregated grouplet-based features for classification, our grouplet-based distance metric learning approach directly learns distances between data points, which better preserves the grouplet structure. Specifically, our algorithm uses an iterative quadratic programming formulation in which the optimal distance metric can be effectively learned in the large-margin nearest-neighbor setting. The framework is flexible: various types of distances can be computed using individual grouplets, and the same distance metric learning algorithm combines the distances computed over individual grouplets for final classification. We extensively evaluate our method on the large-scale Columbia Consumer Video set. Experiments demonstrate that our approach achieves consistent and significant performance improvements.
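As a rough illustration of combining per-grouplet distances under a large-margin nearest-neighbor style constraint, the sketch below learns one non-negative weight per grouplet. It is not the iterative quadratic programming formulation described in the abstract: a plain projected subgradient step on a hinge loss is swapped in for simplicity. The names `combined_distance` and `learn_weights`, the triplet input format, and all parameter defaults are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Each triplet holds two vectors of per-grouplet distances measured from an
# anchor video: one to a same-class neighbor, one to a different-class video.
Triplet = Tuple[Sequence[float], Sequence[float]]


def combined_distance(per_grouplet: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Weighted sum of the distances computed over individual grouplets."""
    return sum(w * d for w, d in zip(weights, per_grouplet))


def learn_weights(triplets: List[Triplet], n_grouplets: int,
                  margin: float = 1.0, lr: float = 0.1,
                  epochs: int = 100) -> List[float]:
    """Hypothetical stand-in for the paper's solver (assumption, not the QP).

    Minimizes the hinge loss max(0, margin + D(same) - D(diff)) over all
    triplets by projected subgradient descent, keeping every weight >= 0.
    """
    w = [1.0 / n_grouplets] * n_grouplets
    for _ in range(epochs):
        grad = [0.0] * n_grouplets
        for d_same, d_diff in triplets:
            violation = (margin + combined_distance(d_same, w)
                         - combined_distance(d_diff, w))
            if violation > 0:  # margin constraint is violated
                for k in range(n_grouplets):
                    grad[k] += d_same[k] - d_diff[k]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, wk - lr * gk / len(triplets))
             for wk, gk in zip(w, grad)]
    return w


# Toy data (made up): grouplet 0 separates the classes, grouplet 1 does not.
toy_triplets: List[Triplet] = [
    ([0.1, 0.9], [1.0, 0.2]),
    ([0.2, 0.8], [0.9, 0.1]),
]
toy_weights = learn_weights(toy_triplets, n_grouplets=2)
```

On this toy input the learned weights favor the grouplet whose distances separate same-class from different-class pairs, which is the intuition behind learning the combination rather than fixing it by hand.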