Dynamic-static unsupervised sequentiality, statistical subunits and lexicon for sign language recognition

Stavros Theodorakis, Vassilis Pitsikalis, Petros Maragos

2014

We introduce a new computational phonetic modeling framework for sign language (SL) recognition. It is based on dynamic-static statistical subunits and provides sequentiality in an unsupervised manner, without prior linguistic information. Subunit "sequentiality" refers to the decomposition of signs into two types of parts, varying and non-varying, that are sequentially stacked across time. Our approach is inspired by the Movement-Hold SL linguistic model, which describes signs as such sequences. First, we segment signs into intra-sign primitives and classify each segment as dynamic or static, i.e., as a movement or a non-movement. These segments are then clustered to construct a set of dynamic and static subunits. The dynamic/static discrimination allows us to employ different visual features for clustering the dynamic and the static segments. Sequences of the generated subunits are used as sign pronunciations in a data-driven lexicon. Based on this lexicon and the corresponding segmentation, each subunit is statistically represented as a hidden Markov model and trained on multimodal sign data. In the proposed approach, dynamic/static sequentiality is incorporated in an unsupervised manner. Further, handshape information is integrated in a parallel hidden Markov modeling scheme. The novel sign language modeling scheme is evaluated in recognition experiments on data from three corpora and two sign languages: Boston University American SL, which is used pre-segmented at the sign level; Greek SL Lemmas; and American SL Large Vocabulary Dictionary. The experiments include both signer-dependent and unseen-signer testing. Results show consistent improvements when compared with other approaches, demonstrating the importance of dynamic/static structure in sub-sign phonetic modeling.
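
The dynamic/static decomposition described above can be illustrated with a minimal sketch. It is not the authors' segmentation method: it assumes a 2D hand trajectory with one (x, y) point per frame and a simple speed threshold, and the names `label_frames`, `merge_segments`, and `speed_threshold` are hypothetical. Frames are labeled dynamic ("D") or static ("S"), and runs of equal labels are merged into segments whose label sequence plays the role of a sign pronunciation.

```python
import math


def label_frames(positions, speed_threshold):
    """Label each frame 'D' (dynamic) or 'S' (static) from hand speed.

    Assumption: speed is the displacement from the previous frame;
    the first frame reuses the displacement to the second frame.
    """
    if len(positions) < 2:
        return ["S"] * len(positions)
    labels = []
    for i in range(len(positions)):
        j = i if i > 0 else 1
        (x0, y0), (x1, y1) = positions[j - 1], positions[j]
        speed = math.hypot(x1 - x0, y1 - y0)
        labels.append("D" if speed > speed_threshold else "S")
    return labels


def merge_segments(labels):
    """Merge runs of equal labels into (label, start, end_exclusive) tuples."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments


# Toy trajectory: hold, move right, hold.
positions = [(0, 0), (0, 0), (0, 0), (1, 0), (2, 0), (3, 0), (3, 0), (3, 0)]
segments = merge_segments(label_frames(positions, speed_threshold=0.5))
pronunciation = "".join(label for label, _, _ in segments)
print(segments)
print(pronunciation)
```

In the framework summarized in the abstract, the dynamic and static segments would then be clustered separately, each with its own visual features, and each resulting subunit would be modeled as a hidden Markov model; this sketch covers only the labeling and segmentation step.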
