A 120 fps 1080p resolution block-based feature extraction architecture implementation for real-time action recognition

2017 
This paper introduces an efficient hardware-accelerated feature extraction architecture that supports 1920×1080 image resolution at 120 fps. We choose the MoFREAK feature [1] for our real-time action recognition system. MoFREAK is a local spatio-temporal feature that combines appearance and motion descriptors independently. We design a two-phase architecture to balance the throughput difference between feature detection and feature description. A binary-mask image is adopted to detect feature point locations efficiently. For feature description, a block-based keypoint technique is proposed to reduce the high bandwidth requirement of spatio-temporal MoFREAK features by grouping features. Synthesized in TSMC 40 nm technology, the proposed architecture operates at 200 MHz with a gate count of 1039K and provides 1.2K block-based features per frame at 120 fps and 0.5K block-based features at 240 fps. With the binary-mask image, we reduce the cycles and bandwidth of image scanning by about 88%. With the block-based keypoint, we reduce the salient points and keypoints by about 81%. The combination of the binary-mask image and the block-based keypoint reduces the bandwidth of the feature extraction system by about 81%.
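To put the reported figures in perspective, the following is a minimal back-of-the-envelope sketch of the cycle budget implied by the abstract. It uses only the numbers stated above (1920×1080, 200 MHz, 120 and 240 fps, 1.2K and 0.5K features); the function names are hypothetical, and reading "1.2K" as 1200 and "0.5K" as 500 features per frame is an assumption.

```python
# Figures taken from the abstract; everything else is illustrative.
WIDTH, HEIGHT = 1920, 1080      # image resolution in pixels
CLOCK_HZ = 200_000_000          # reported operating frequency (200 MHz)


def cycle_budget(fps: int) -> dict:
    """Pixel rate and clock cycles available per frame and per pixel."""
    pixels_per_frame = WIDTH * HEIGHT
    cycles_per_frame = CLOCK_HZ / fps
    return {
        "pixels_per_second": pixels_per_frame * fps,
        "cycles_per_frame": cycles_per_frame,
        "cycles_per_pixel": cycles_per_frame / pixels_per_frame,
    }


def cycles_per_feature(fps: int, features_per_frame: int) -> float:
    """Average clock cycles available per block-based feature.

    Assumes the reported feature counts are per frame.
    """
    return CLOCK_HZ / fps / features_per_frame


if __name__ == "__main__":
    print(cycle_budget(120))
    print(cycles_per_feature(120, 1200))  # assumed reading of "1.2K"
    print(cycles_per_feature(240, 500))   # assumed reading of "0.5K"
```

Under these assumptions, the clock offers fewer than one cycle per pixel at 120 fps, so a design that visited every pixel individually could not keep up. This may be one reason the reported 88% reduction in scanning cycles matters, although the abstract does not state that connection explicitly.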