Chinese Sign Language Recognition with Sequence to Sequence Learning

2017 
In this paper, we formulate Chinese sign language recognition (SLR) as a sequence-to-sequence problem and propose an encoder-decoder framework to handle it. The proposed framework is based on a convolutional neural network (CNN) and a recurrent neural network (RNN) with long short-term memory (LSTM). Specifically, the CNN extracts the spatial features of the input frames, and two cascaded LSTM layers implement the encoder-decoder structure. The encoder-decoder not only learns the temporal dynamics of the input features but also learns a context model of sign language words. We feed the image sequences captured by a Microsoft Kinect 2.0 into the network to build an end-to-end model. Moreover, we build a second model that uses skeletal coordinates as the input to the encoder-decoder framework. In the recognition stage, a probability combination method is proposed to fuse the two models and obtain the final prediction. We validate our method on a self-built dataset, and the experimental results demonstrate the effectiveness of the proposed method.
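
As a rough illustration of the described pipeline, the sketch below (in PyTorch) shows a CNN feeding two cascaded LSTMs in an encoder-decoder arrangement, plus a simple probability-combination fusion of the RGB and skeletal models. All layer sizes, the stand-in CNN, and the fusion weight `alpha` are our own assumptions for readability, not the paper's exact configuration; the skeletal branch would use the same encoder-decoder with an MLP over joint coordinates in place of the CNN.

```python
# Hypothetical sketch of a CNN + LSTM encoder-decoder for sign language
# recognition, loosely following the abstract. Dimensions, the small CNN,
# and the fusion weight are illustrative assumptions, not the paper's values.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNNLSTMEncoderDecoder(nn.Module):
    def __init__(self, vocab_size, feat_dim=256, hidden_dim=512):
        super().__init__()
        # Per-frame spatial feature extractor (stand-in for the paper's CNN).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Two cascaded LSTM layers: the first encodes the frame features,
        # the second decodes word predictions conditioned on the encoding.
        self.encoder = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, vocab_size)

    def forward(self, frames):
        # frames: (batch, time, channels, height, width)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        enc_out, state = self.encoder(feats)
        dec_out, _ = self.decoder(enc_out, state)
        return self.classifier(dec_out)  # (batch, time, vocab_size) logits


def fuse_predictions(logits_rgb, logits_skel, alpha=0.5):
    """Probability-combination fusion of the RGB and skeletal models.

    alpha is an assumed weighting; the abstract does not specify how the
    two probability distributions are combined."""
    p_rgb = F.softmax(logits_rgb, dim=-1)
    p_skel = F.softmax(logits_skel, dim=-1)
    return alpha * p_rgb + (1.0 - alpha) * p_skel
```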