Human Pose Estimation in Space and Time Using 3D CNN

Agne Grinciunaite, Amogh Gudi, H. Emrah Tasli, Marten den Uyl

2016

This paper explores the capabilities of convolutional neural networks on a task that humans handle easily: perceiving the 3D pose of a human body from varying angles. Our approach is restricted to a monocular vision system. We apply a convolutional neural network to RGB videos and extend it to three-dimensional convolutions by encoding the time dimension of the video as the third dimension of the convolutional space, then regress directly to human body joint positions in 3D coordinate space. The results show that such a network can achieve state-of-the-art performance on the selected Human3.6M dataset, demonstrating that temporal data can be successfully represented with an additional dimension in the convolutional operation.
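To make the idea of treating time as the third convolutional dimension concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a single-channel clip stored as nested lists indexed (frame, row, column), whereas the paper works on RGB video, and the names `conv3d_valid`, `clip`, and `kernel` are hypothetical. The kernel slides over frames as well as over rows and columns, so one filter can respond to motion between consecutive frames.

```python
def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D cross-correlation of a (T, H, W) clip with a (kt, kh, kw) kernel."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):              # slide along time
        plane = []
        for y in range(H - kh + 1):          # slide along image rows
            row = []
            for x in range(W - kw + 1):      # slide along image columns
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip (assumed data): 5 frames of 4x4 pixels, each pixel equal to its frame index.
clip = [[[float(t)] * 4 for _ in range(4)] for t in range(5)]

# A 2x1x1 temporal-difference kernel: next frame minus current frame.
kernel = [[[-1.0]], [[1.0]]]

motion = conv3d_valid(clip, kernel)
shape = (len(motion), len(motion[0]), len(motion[0][0]))
print(shape)  # output extent along (time, rows, columns)
```

In a full network of the kind the abstract describes, stacked layers of such 3D filters would be followed by a regression layer that outputs three coordinates per body joint; that part is omitted here because the abstract does not give the layer sizes or the joint count.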

Keywords:

Monocular vision
Recurrent neural network
Coordinate space
Convolutional neural network
Pose
Computer vision
RGB color model
Computer science
Temporal database
Artificial intelligence
Encoding (memory)
