YOLSE: Egocentric Fingertip Detection from Single RGB Images

2017 
With the development of wearable devices and augmented reality (AR), human-device interaction in egocentric vision, especially hand-gesture-based interaction, has attracted considerable attention among computer vision researchers. In this paper, we build a new dataset named EgoGesture and propose a heatmap-based solution for fingertip detection. First, we describe the dataset collection process and give a comprehensive analysis of the dataset, showing that it covers substantial data samples across varied environments and dynamic hand shapes. We then propose a heatmap-based fully convolutional network (FCN) named YOLSE (You Only Look what You Should See) for fingertip detection in egocentric vision from a single RGB image. The fingermap is our proposed probabilistic representation for multi-fingertip detection; it not only gives the location of each fingertip but also indicates whether that fingertip is visible. Compared with state-of-the-art fingertip detection algorithms, our framework performs best, with limited dependence on the hand detection result. In our experiments, we achieve a fingertip detection error of about 3.69 pixels on 640×480 video frames, and the average forward pass of YOLSE takes about 15.15 ms.
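The fingermap idea described above can be sketched as follows: each finger gets its own heatmap, the peak location gives the fingertip coordinates, and a low peak value signals that the fingertip is not visible. This is a minimal NumPy sketch under assumed conventions; `vis_threshold` and the map size are illustrative choices, not values from the paper.

```python
import numpy as np

def decode_fingermap(heatmap, vis_threshold=0.5):
    """Decode one per-finger heatmap ("fingermap") into a fingertip
    location plus an implicit visibility decision.

    heatmap: 2D array of per-pixel fingertip probabilities.
    vis_threshold: assumed cutoff below which the fingertip is
    treated as invisible (hypothetical hyperparameter).
    Returns (x, y) in map coordinates, or None if not visible.
    """
    if heatmap.max() < vis_threshold:
        return None  # no confident peak -> fingertip not visible
    # argmax over the flattened map, converted back to (row, col)
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return (int(x), int(y))

# Toy usage: a 48x48 map with a single strong peak.
hm = np.zeros((48, 48), dtype=np.float32)
hm[20, 30] = 0.9
print(decode_fingermap(hm))        # strong peak -> (30, 20)
print(decode_fingermap(hm * 0.1))  # weak peak -> None (invisible)
```

A network trained on such maps can thus encode both localization and visibility in a single output channel per finger, which is the dual role the abstract attributes to the fingermap.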