Time-varying LSTM networks for action recognition

2018 
We describe an architecture of Time-Varying Long Short-Term Memory recurrent neural networks (TV-LSTMs) for human action recognition. The main innovation of this architecture is its use of hybrid weights: a combination of shared and non-shared weights, which we refer to as varying weights. The varying weights enhance the ability of LSTMs to represent videos and other sequential data. We evaluate TV-LSTMs on the UCF-11, HMDB-51, and UCF-101 human action datasets and achieve top-1 accuracies of 99.64%, 57.52%, and 85.06%, respectively. The model performs competitively against models that use both RGB and other features, such as optical flow and improved Dense Trajectories. In this paper, we also propose and analyze methods for selecting the varying weights.
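The abstract does not spell out how the shared and varying weights are combined inside the cell. Below is a minimal, hypothetical sketch of one plausible reading: a standard LSTM cell whose gate pre-activations sum a shared input-to-hidden projection (reused at every time step) with a small per-timestep projection. The class name `TVLSTMCell`, the per-step `nn.Linear` layers, and the additive combination are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch: LSTM cell with hybrid weights, i.e. a shared
# projection plus a per-timestep ("varying") projection. The exact split
# between shared and varying parameters is an assumption for illustration.
import torch
import torch.nn as nn


class TVLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size, num_steps):
        super().__init__()
        self.hidden_size = hidden_size
        # Shared weights: reused at every time step (standard LSTM behaviour).
        self.shared_ih = nn.Linear(input_size, 4 * hidden_size)
        self.shared_hh = nn.Linear(hidden_size, 4 * hidden_size)
        # Varying weights: one extra input-to-hidden projection per time step.
        self.varying_ih = nn.ModuleList(
            [nn.Linear(input_size, 4 * hidden_size, bias=False)
             for _ in range(num_steps)]
        )

    def forward(self, x_seq):
        # x_seq: (batch, time, input_size)
        batch, steps, _ = x_seq.shape
        h = x_seq.new_zeros(batch, self.hidden_size)
        c = x_seq.new_zeros(batch, self.hidden_size)
        outputs = []
        for t in range(steps):
            x_t = x_seq[:, t]
            # Gate pre-activations: shared + time-step-specific input terms.
            gates = (self.shared_ih(x_t) + self.varying_ih[t](x_t)
                     + self.shared_hh(h))
            i, f, g, o = gates.chunk(4, dim=1)
            i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
            g = torch.tanh(g)
            c = f * c + i * g
            h = o * torch.tanh(c)
            outputs.append(h)
        return torch.stack(outputs, dim=1), (h, c)
```

Under this reading, setting every `varying_ih[t]` to zero recovers an ordinary LSTM, so the varying weights act as per-timestep refinements on top of the shared parameters.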