Multisource learning for skeleton-based action recognition using deep LSTM and CNN

2018 
Human action recognition based on skeletal information of the human body has been widely studied, because such information expresses action-related features simply and clearly and is unaffected by the body's physical appearance. This paper therefore discusses action recognition based on three-dimensional skeletal information obtained from RGB-D videos. We propose a multisource action recognition model that combines features of the temporal and spatial domains. Because different actions involve different parts of the body, our model considers action features at three levels: global, local, and detail. For temporal features, we adopt long short-term memory (LSTM) to build a model that analyzes skeleton sequences. For spatial-domain features, we analyze the effects of three features on action recognition: skeleton joint coordinates, pairwise relative positions, and movement speed. Finally, the temporal- and spatial-domain models are combined into a multisource model to improve recognition accuracy. Experiments show that our model considerably improves the recognition of a variety of general actions and interactive activities.
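The three spatial features named in the abstract (joint coordinates, pairwise relative positions, and movement speed) can be sketched as follows for a skeleton sequence stored as an array of shape (T, J, 3). This is a minimal illustration, not the authors' implementation; the function name, the assumed frame rate, and the use of simple frame-to-frame differences for speed are all assumptions.

```python
import numpy as np

def spatial_features(seq, fps=30.0):
    """Sketch of three per-frame spatial features from a skeleton sequence.

    seq : array of shape (T, J, 3) -- T frames, J joints, 3D coordinates.
    fps : assumed capture frame rate, used to convert displacement to speed.
    """
    coords = seq                                    # (T, J, 3) raw joint coordinates
    # Pairwise relative position: offset of every joint from every other joint.
    rel = seq[:, :, None, :] - seq[:, None, :, :]   # (T, J, J, 3)
    # Movement speed: magnitude of frame-to-frame displacement times frame rate.
    disp = np.diff(seq, axis=0)                     # (T-1, J, 3)
    speed = np.linalg.norm(disp, axis=-1) * fps     # (T-1, J)
    return coords, rel, speed
```

In a model like the one described, these arrays would be flattened per frame and fed, alongside the raw coordinates, into the temporal (LSTM) and spatial branches before fusion.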