Video action classification using symmelets and deep learning

2017 
Classification of human actions is a challenging and important problem in many video-based applications. Two common feature types, hand-crafted and deep-learned, are widely adopted for video representation and have proven effective on many well-known datasets in the literature. However, hand-crafted features lack the ability to capture discriminative and semantic cues, while deep-learned features have often failed to outperform the best hand-crafted ones. This paper proposes a novel symmelet-based classification approach to improve the accuracy of state-of-the-art frameworks. A "symmelet" is a symmetrical pair consisting of a SURF point and its corresponding symmetrical point in the same frame; such symmetrical structures frequently appear in video scenes. With symmelets, redundant (background) features can be filtered out so that action content is represented more accurately. The new approach combines symmelets, improved dense trajectories (IDT), and trajectory-pooled deep-convolutional descriptors (TDD) to learn useful deep features for video representation. Performance evaluation on two challenging datasets, HMDB51 and UCF101, shows that the proposed solution achieves higher recognition accuracy than other state-of-the-art frameworks.
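
The abstract does not spell out how symmelets are detected, but the definition above suggests a simple sketch: detect SURF keypoints, mirror the frame horizontally, re-detect and match descriptors, and map matched mirrored points back to the original coordinates to form symmetrical pairs. The following is a minimal illustrative sketch under that assumption, not the paper's published algorithm; the function name `find_symmelets`, the Hessian threshold, the ratio-test value, and the self-match separation are all hypothetical choices, and SURF requires an OpenCV contrib build with the nonfree modules enabled.

```python
import cv2

def find_symmelets(frame, ratio=0.75, min_sep=5.0):
    """Return candidate symmelets: pairs of SURF keypoints that are
    approximate horizontal mirror images of each other within one frame.
    (Illustrative sketch; not the paper's exact procedure.)"""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # SURF lives in opencv-contrib and needs a build with nonfree enabled.
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)

    kp_orig, des_orig = surf.detectAndCompute(gray, None)
    mirrored = cv2.flip(gray, 1)  # flip about the vertical axis
    kp_mir, des_mir = surf.detectAndCompute(mirrored, None)
    if des_orig is None or des_mir is None:
        return []

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    width = gray.shape[1]
    pairs = []
    for match in matcher.knnMatch(des_orig, des_mir, k=2):
        if len(match) < 2:
            continue
        m, n = match
        if m.distance < ratio * n.distance:  # Lowe's ratio test
            px, py = kp_orig[m.queryIdx].pt
            mx, my = kp_mir[m.trainIdx].pt
            qx = width - 1 - mx              # map mirrored x back to original coords
            if abs(px - qx) > min_sep:       # drop self-matches on the symmetry axis
                pairs.append(((px, py), (qx, my)))
    return pairs
```

In the spirit of the filtering step described in the abstract, one plausible use of the output is to keep only IDT trajectories that pass near some symmelet pair and treat the rest as background before TDD pooling; the paper's actual filtering criterion may differ.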