Joint RGB-Pose Based Human Action Recognition for Anomaly Detection Applications

2019 
Human Action Recognition (HAR) and Human Behaviour Anomaly Detection (HBAD) systems require intelligent, multimodal feature extraction for classification. RGB-based deep learning methods represent the state of the art for HAR and HBAD. On the other hand, human poses extracted by popular RGB-based detectors have shown promising results for posture-level HAR and HBAD. However, both modalities have limitations: RGB-based methods struggle to extract explainable features that generalise, especially when contextual data is dominant, while human poses cannot model complex human actions, i.e. those involving objects or carrying high contextual information. To overcome these limitations, three joint RGB-pose multimodal networks are proposed, exploiting combinations of CNNs, 3DCNNs, RNNs, MLSTMs, and a ResNet-152 pre-trained CNN. The three proposed joint learning methods are compared with the corresponding RGB-based and pose-based methods in the context of HAR for HBAD applications. Experimental results on the challenging UCF101 and MPOSE2019 datasets show promising results in terms of recognition accuracy and processing time.
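One common way to combine an RGB stream and a pose stream, as in the joint networks described above, is late fusion of per-modality class scores. The sketch below is a minimal, hypothetical illustration of weighted score fusion; the function names, weights, and example logits are assumptions for illustration and do not reproduce the paper's actual architectures.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def late_fusion(rgb_logits, pose_logits, w_rgb=0.5):
    """Weighted average of per-modality softmax scores (late fusion).

    w_rgb balances the RGB stream against the pose stream; the value
    0.5 is an arbitrary illustrative choice, not taken from the paper.
    """
    rgb_scores = softmax(rgb_logits)
    pose_scores = softmax(pose_logits)
    return [w_rgb * r + (1 - w_rgb) * p
            for r, p in zip(rgb_scores, pose_scores)]

# Hypothetical per-class logits from an RGB stream and a pose stream
rgb_logits = [2.0, 0.5, 0.1]
pose_logits = [1.5, 1.4, 0.2]
fused = late_fusion(rgb_logits, pose_logits)
predicted_class = fused.index(max(fused))
```

In practice each stream would be a trained network (e.g. a 3DCNN on RGB clips and an RNN/MLSTM on pose sequences), and fusion could also happen at the feature level by concatenating intermediate embeddings before a shared classifier.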