Spatio-temporal SIFT and its application to human action classification

Manal Al Ghamdi, Lei Zhang, Yoshihiko Gotoh

2012

This paper presents a space-time extension of the scale-invariant feature transform (SIFT), which was originally applied to 2-dimensional (2D) images. Most previous extensions dealt with 3-dimensional (3D) spatial information in volumetric images, using a combination of a 2D detector and a 3D descriptor for applications such as medical image analysis. In this work we build a spatio-temporal difference-of-Gaussian (DoG) pyramid to detect local extrema, with the aim of processing video streams. Interest points are extracted not only from the spatial plane (xy) but also from the planes along the time axis (xt and yt). The space-time extension was evaluated on the human action classification task. Experiments with the KTH and UCF sports datasets show that the approach produces results comparable to the state of the art.
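The detection step described above can be illustrated with a minimal sketch. The abstract does not give the smoothing parameters, the number of pyramid levels, or the exact neighbourhood used for the extremum test, so everything below is an assumption made for illustration: the function names (`gaussian_kernel`, `smooth_video`, `dog_stack`, `plane_extrema`), the values of `sigma0`, `k` and `levels`, the contrast `threshold`, and the choice of a 3x3x3 neighbourhood spanning scale plus the two axes of each plane. It covers a single octave of a DoG stack on a `(T, H, W)` video array and is not the authors' implementation.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at about 3 sigma, normalised to sum to 1."""
    radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def smooth_video(video, sigma):
    """Separable Gaussian smoothing along t, y and x of a (T, H, W) array."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    out = video.astype(float)
    for axis in range(3):
        pad = [(0, 0)] * 3
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")  # replicate border values
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded)
    return out

def dog_stack(video, sigma0=1.0, k=2 ** 0.5, levels=4):
    """Differences of adjacent smoothing levels; shape (levels - 1, T, H, W).

    sigma0, k and levels are illustrative values, not taken from the paper.
    """
    blurred = [smooth_video(video, sigma0 * k ** i) for i in range(levels)]
    return np.stack([blurred[i + 1] - blurred[i] for i in range(levels - 1)])

# Axes of the DoG array (s, t, y, x) that span each plane.
PLANES = {"xy": (2, 3), "xt": (1, 3), "yt": (1, 2)}

def plane_extrema(dog, threshold=0.01):
    """Return {plane: [(s, t, y, x), ...]} of strict local extrema.

    Assumed test: a point is kept for a plane if it is strictly larger or
    strictly smaller than its 26 neighbours in (scale, plane axis 1,
    plane axis 2), and its magnitude exceeds a contrast threshold.
    """
    found = {name: [] for name in PLANES}
    S, T, H, W = dog.shape
    for s in range(1, S - 1):
        for t in range(1, T - 1):
            for y in range(1, H - 1):
                for x in range(1, W - 1):
                    value = dog[s, t, y, x]
                    if abs(value) <= threshold:
                        continue
                    centre = [s, t, y, x]
                    for name, (a, b) in PLANES.items():
                        index = [slice(s - 1, s + 2), t, y, x]
                        index[a] = slice(centre[a] - 1, centre[a] + 2)
                        index[b] = slice(centre[b] - 1, centre[b] + 2)
                        cube = dog[tuple(index)]  # 3x3x3 neighbourhood
                        others = np.delete(cube.ravel(), 13)  # drop the centre
                        if value > others.max() or value < others.min():
                            found[name].append((s, t, y, x))
    return found
```

Calling `plane_extrema(dog_stack(video))` on a grey-level clip stored as a `(T, H, W)` array returns candidate interest points per plane. A full pipeline would also need octave downsampling, keypoint refinement and a descriptor, none of which are covered here.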

Keywords:

Artificial intelligence
Gaussian pyramid
Pyramid
STREAMS
Detector
Computer vision
Computer science
Pattern recognition
Scale-invariant feature transform
feature transform
Maxima and minima
action recognition
