An object detector based on multiscale sliding window search using a fully pipelined binarized CNN on an FPGA

2017 
Object detection combines two problems: classifying the category of each detected object and localizing it. Frame object detection is used in embedded vision systems such as robots, automobiles, security cameras, and drones. These applications require high-performance computation at low power consumption on an inexpensive device. This paper proposes a multiscale sliding-window object detector using a fully pipelined binarized deep convolutional neural network (BCNN) on an FPGA. It consists of a sliding window part, a fully pipelined BCNN classifier, and an ARM processing unit for detection. Duplicate detections are filtered by a non-maximum suppression algorithm running on the ARM processor. We propose fully pipelined layers for the BCNN and an architecture for their FPGA realization. Since the proposed BCNN circuit uses on-chip memories on the FPGA, its throughput is higher than that of a GPU-based one, with practical recognition accuracy. We trained a VGG11-based BCNN on the KITTI vision benchmark for the car detection scenario. Then, we implemented the proposed object detector on the Xilinx Inc. Zynq UltraScale+ MPSoC zcu102 evaluation board. The GPU-based object detectors were too slow for the real-time application requirement (HD frame rate), with the exception of YOLOv2. Compared with the GPU implementation of YOLOv2, the proposed FPGA detector had higher recognition accuracy and lower power consumption. Thus, the FPGA-based object detector is suitable for embedded real-time applications.
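The two software-visible steps named in the abstract, multiscale window generation and non-maximum suppression, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the square windows, the scale list, and the IoU threshold of 0.5 are all assumptions chosen for illustration, and the BCNN classifier that would score each window is omitted.

```python
from typing import Iterator, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def sliding_windows(width: int, height: int, window: int, stride: int,
                    scales: Sequence[float]) -> Iterator[Box]:
    """Yield square candidate windows at each scale (illustrative only)."""
    for s in scales:
        w = int(window * s)  # window side length at this scale
        for y in range(0, height - w + 1, stride):
            for x in range(0, width - w + 1, stride):
                yield (x, y, x + w, y + w)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: Sequence[Box], scores: Sequence[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep box i only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

In a pipeline like the one described, the indices returned by `non_max_suppression` would select the final detections from the windows the classifier marked as positive; the greedy variant shown here is one common choice, and the abstract does not state which variant the authors used.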