3D Action Recognition Exploiting Hierarchical Deep Feature Fusion Model

2020 
Numerous existing handcrafted feature-based and conventional machine learning-based approaches cannot capture the dense correlations of the skeleton structure in the spatiotemporal dimension. On the other hand, some modern methods exploit Long Short-Term Memory (LSTM) networks to learn temporal action attributes but lack an efficient scheme for revealing high-level informative features. To address these issues, this research introduces a novel hierarchical deep feature fusion model for 3D skeleton-based human action recognition, in which the deep information for modeling human appearance and action dynamics is obtained by Convolutional Neural Networks (CNNs). Deep features of geometrical joint distance and orientation are extracted via a multi-stream CNN architecture to uncover the hidden correlations in both the spatial and temporal dimensions. Experimental results on the NTU RGB+D dataset demonstrate the superiority of the proposed fusion model over several recent deep learning (DL)-based action recognition approaches.
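The abstract mentions geometrical joint-distance and joint-orientation features computed from 3D skeleton sequences before they are fed to the multi-stream CNN. The paper's exact feature definitions are not given here, so the following is only a minimal NumPy sketch of one common interpretation: pairwise Euclidean distances between joints per frame, and unit direction vectors along a hypothetical list of bones (joint-index pairs).

```python
import numpy as np

def joint_distance_features(seq):
    """Pairwise Euclidean distances between joints in every frame.

    seq: (T, J, 3) array of 3D joint coordinates over T frames.
    Returns a (T, J, J) symmetric distance matrix per frame.
    """
    diff = seq[:, :, None, :] - seq[:, None, :, :]   # (T, J, J, 3)
    return np.linalg.norm(diff, axis=-1)             # (T, J, J)

def joint_orientation_features(seq, bones):
    """Unit direction vectors along the given bones.

    bones: list of (parent, child) joint-index pairs (an assumption;
    the actual skeleton topology depends on the sensor, e.g. Kinect v2
    for NTU RGB+D). Returns a (T, len(bones), 3) array.
    """
    vecs = np.stack([seq[:, b] - seq[:, a] for a, b in bones], axis=1)
    norms = np.linalg.norm(vecs, axis=-1, keepdims=True)
    return vecs / np.maximum(norms, 1e-8)            # avoid divide-by-zero
```

Each feature type would then be rendered as an image-like tensor and fed to its own CNN stream, with the streams' deep features fused for classification.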