Human Activities Recognition from Videos Based on Compound Deep Neural Network

2021 
Two-stream deep neural networks learn contour information about human activities from individual image frames and motion information from the changes between frames. Optical flow images are a commonly used way to depict the motion between frames, but existing methods usually compute them offline. In this paper, the deep neural network FlowNet2.0 is used to generate optical flow images, and on this basis a compound deep neural network is proposed that merges spatial and temporal information to recognize human activities from videos. Specifically, the proposed compound network is composed of two sub-networks: a static-data-stream learning network and a dynamic-data-stream learning network. The former extracts spatial evolution information from RGB images; the latter first generates optical flow prediction images from the RGB image sequence and then extracts temporal evolution information from those optical flow images. Finally, the outputs of the two sub-networks are combined by a fusion algorithm to recognize human activities. Experimental results show that the proposed compound deep neural network can effectively identify human activities from video sequences, reaching an average classification accuracy of 83.35% on the UCF101 data set.
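The abstract does not say which fusion algorithm combines the two sub-networks. As a rough illustration only, the sketch below assumes a weighted average of per-class softmax scores, a common late-fusion choice in two-stream models. The function names, the `temporal_weight` parameter, and the three-class logits are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one vector of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_streams(
    spatial_logits: Sequence[float],
    temporal_logits: Sequence[float],
    temporal_weight: float = 0.5,  # assumed value; the paper's weighting is not given
) -> List[float]:
    """Weighted average of the two streams' class probabilities (assumed fusion rule)."""
    if len(spatial_logits) != len(temporal_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= temporal_weight <= 1.0:
        raise ValueError("temporal_weight must lie in [0, 1]")
    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    return [
        (1.0 - temporal_weight) * s + temporal_weight * t
        for s, t in zip(p_spatial, p_temporal)
    ]


def predict(fused: Sequence[float]) -> int:
    """Index of the class with the highest fused probability."""
    return max(range(len(fused)), key=lambda i: fused[i])


# Hypothetical scores for three activity classes from each stream.
spatial_logits = [2.0, 1.0, 0.1]   # RGB (static-data-stream) network output
temporal_logits = [0.5, 2.5, 0.3]  # optical flow (dynamic-data-stream) network output

fused = fuse_two_streams(spatial_logits, temporal_logits)
label = predict(fused)
```

With these made-up scores the spatial stream alone favors class 0, while equal-weight fusion favors class 1, which shows how motion evidence can change the final decision.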