Deep Moving Poselets for Video Based Action Recognition

2017 
We propose a new approach to action classification in video, which uses deep appearance and motion features extracted from spatio-temporal volumes defined along body part trajectories to learn mid-level classifiers called deep moving poselets. A deep moving poselet is a classifier that captures a characteristic body part configuration, with a specific appearance and undergoing a specific movement. By having this mid-level representation of a body part be shared across action classes and by learning it jointly with action classifiers, we obtain a representation that is interpretable, shared and discriminative. In addition, by using sparsity-inducing norms to regularize action classifiers, we can reduce the number of deep moving poselets used by each class without hurting performance. Experiments show that the proposed method achieves state-of-the-art performance on the popular and challenging sub-JHMDB and MSR Daily Activity datasets.
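The abstract mentions using sparsity-inducing norms on the action classifiers so that each class relies on fewer deep moving poselets. A common way to achieve this kind of group-level pruning is an l2,1 (group-lasso) penalty, where each poselet's weights form one group and a group soft-thresholding step zeroes out whole poselets. The sketch below is a minimal illustration of that generic mechanism, not the paper's actual training procedure; all names and values are hypothetical.

```python
import numpy as np

# Hypothetical sketch: group-sparse (l2,1) regularization over
# action-classifier weights, one group per moving poselet. The proximal
# (group soft-threshold) step zeroes entire poselet columns, so poselets
# unused by the action classifiers are pruned.

def group_soft_threshold(W, lam):
    """Proximal operator of lam * sum_j ||W[:, j]||_2.

    W: (num_actions, num_poselets) weight matrix; each column holds the
    weights every action classifier assigns to one moving poselet.
    """
    norms = np.linalg.norm(W, axis=0)                    # per-poselet column norms
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return W * scale                                     # shrink columns; weak ones go to zero

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(5, 8))                   # toy: 5 actions, 8 poselets
W[:, 2] *= 5.0                                           # one poselet carries strong weight
W_sparse = group_soft_threshold(W, lam=0.3)
pruned = np.flatnonzero(np.linalg.norm(W_sparse, axis=0) == 0)
print("pruned poselets:", pruned)
```

After the proximal step, weakly weighted poselet columns are exactly zero while strongly used ones survive (merely shrunk), which is the sense in which such a norm "reduces the number of deep moving poselets used by each class."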