Pose Recognition Using Convolutional Neural Networks on Omni-directional Images

2017 
Abstract
Convolutional neural networks (CNNs) are widely used in computer vision applications. In this work, we present a methodology for pose classification of binary human silhouettes using CNNs, enhanced with image features based on Zernike moments that are modified for fisheye images. The training set consists of synthetic images generated from three-dimensional (3D) human models, using the calibration model of an omni-directional (fisheye) camera. Testing is performed using real images, also acquired by omni-directional cameras. Here, we employ our previously proposed geodesically corrected Zernike moments (GZMI) and confirm their merit as stand-alone descriptors of calibrated fisheye images. Subsequently, we explore the efficiency of transfer learning from the model previously trained on synthetically generated silhouettes to pose classification of real silhouettes, by continuing the training of the already trained network with a few frames of annotated real silhouettes. Furthermore, we propose an enhanced architecture that combines the calculated GZMI features of each image with the features generated at the CNN's last convolutional layer, both feeding the first hidden layer of the traditional neural network at the end of the CNN. Testing is performed using synthetically generated silhouettes as well as real ones. Results show that the proposed enhancement of the CNN architecture, combined with transfer learning, improves pose classification accuracy for both the synthetic and the real silhouette images.
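The feature combination described in the abstract can be illustrated with a minimal sketch: the flattened output of the last convolutional layer and the GZMI descriptor are concatenated, and the joint vector feeds the first hidden layer. This is not the authors' implementation. The vector sizes (64 and 20), the hidden layer width (16), the ReLU activation, the random weights, and the helper names `fuse_features`, `dense`, and `init_layer` are all assumptions made for illustration.

```python
import random

def fuse_features(conv_features, gzmi_features):
    """Concatenate flattened CNN features with the handcrafted GZMI descriptor."""
    return list(conv_features) + list(gzmi_features)

def dense(x, weights, bias):
    """Fully connected layer with ReLU: one output per weight row."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def init_layer(n_in, n_out, rng):
    """Small random weights standing in for trained parameters (assumption)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias

rng = random.Random(0)
conv_features = [rng.random() for _ in range(64)]  # hypothetical conv output size
gzmi_features = [rng.random() for _ in range(20)]  # hypothetical GZMI descriptor size

fused = fuse_features(conv_features, gzmi_features)
weights, bias = init_layer(len(fused), 16, rng)    # hypothetical hidden width
hidden = dense(fused, weights, bias)
print(len(fused), len(hidden))
```

In this sketch the hidden layer sees both feature sources at once, so its weights can learn interactions between the learned and the handcrafted descriptors. Under transfer learning, the same parameters would be trained further on the annotated real silhouettes instead of being reinitialized.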