Head pose estimation with soft labels using regularized convolutional neural network

2019 
Abstract: Head pose estimation has a wide range of applications, such as driver monitoring, attention recognition and multi-view facial analysis. Most previous works use detected face regions to estimate head pose with hard labels, which limits the exploration of more discriminative texture information and tends to over-fit. In this paper, we present a novel framework to alleviate this problem. It takes entire images as input and constructs soft labels from a Gaussian distribution function as supervision information. It then introduces a regularized convolutional neural network architecture that is optimized with two similarity measures as loss functions: Kullback–Leibler divergence loss and Jeffreys divergence loss. The regularized architecture includes four modules: one backbone net for learning common features, two parallel branches named sub-net1 and sub-net2 for learning complementary features, and one feature fusion module, namely the fused net. The architecture is trained in an alternating fashion, making the learned model more robust and stable. Extensive experiments have been carried out on three public datasets: Pointing04, CAS-PEAL-R1 and CMU Multi-PIE. The results show that our method achieves a significant improvement in performance compared to the state of the art. The best accuracies we achieve on the three datasets are 85.77%, 99.19% and 99.88%, respectively.
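The abstract does not give the exact formulation, so the following is only a minimal sketch of the two ideas it names: soft labels built from a Gaussian over discrete pose classes, and the Kullback–Leibler and Jeffreys divergences as loss terms. The function names, the number of pose classes, the value of `sigma`, the normalization of the soft label, and the example logits are all assumptions made for illustration and are not taken from the paper. The Jeffreys divergence is written in its standard symmetric form, KL(p‖q) + KL(q‖p).

```python
import math


def gaussian_soft_label(true_idx, num_classes, sigma=1.0):
    """Soft label: a discretized Gaussian centered on the true pose class.

    Assumption: classes are ordered pose bins, so neighboring indices
    correspond to neighboring poses. sigma controls how much mass leaks
    to adjacent bins (hypothetical default).
    """
    weights = [math.exp(-0.5 * ((k - true_idx) / sigma) ** 2)
               for k in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]  # normalize to a distribution


def softmax(logits):
    """Numerically stable softmax turning network outputs into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions; eps guards against log(0)."""
    return sum(pi * math.log(max(pi, eps) / max(qi, eps))
               for pi, qi in zip(p, q))


def jeffreys_divergence(p, q, eps=1e-12):
    """Jeffreys divergence: the symmetric sum KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q, eps) + kl_divergence(q, p, eps)


# Illustrative usage with 7 hypothetical pose bins and made-up logits.
target = gaussian_soft_label(true_idx=3, num_classes=7, sigma=1.0)
pred = softmax([0.1, 0.3, 1.0, 2.0, 1.0, 0.3, 0.1])
print("KL loss:      ", kl_divergence(target, pred))
print("Jeffreys loss:", jeffreys_divergence(target, pred))
```

Compared with a one-hot (hard) label, the soft label assigns nonzero probability to poses adjacent to the true one, so a prediction that is off by one bin is penalized less than one that is far away. How the two losses are assigned to the sub-nets and the fused net, and how the alternating training is scheduled, is not stated in the abstract and is left out of this sketch.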