Fixed Point Implementation of Tiny-Yolo-v2 using OpenCL on FPGA

Deep convolutional neural networks (CNNs) have recently gained popularity in many applications such as image classification, video analytics, and object detection. Because they are compute-intensive and memory-intensive, CNN-based algorithms are hard to implement on embedded devices. Although recent studies have explored hardware implementations of CNN-based object classification models such as AlexNet and VGG, implementations of CNN-based object detection models on Field Programmable Gate Arrays (FPGAs) remain rare. This study therefore proposes a 16-bit fixed-point implementation of a CNN-based object detection model, Tiny-Yolo-v2, on the Cyclone V PCIe Development Kit FPGA board using a High-Level Synthesis (HLS) tool, OpenCL. To account for FPGA constraints on computational resources, memory bandwidth, and on-chip memory, a data pre-processing approach is proposed that merges batch normalization into the convolution layer. To the best of our knowledge, this is the first implementation of the Tiny-Yolo-v2 object detection algorithm on an FPGA using the Intel FPGA Software Development Kit (SDK) for OpenCL. The proposed implementation achieves a peak performance of 21 GOPs at a working frequency of 100 MHz.
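The merging of batch normalization into the convolution layer can be illustrated with a minimal sketch. It is not the authors' code: it assumes the standard inference-time batch-norm formula, and the function names (`fold_batch_norm`, `to_fixed16`), the `eps` value, and the choice of 8 fractional bits for the 16-bit format are all hypothetical, since the abstract does not specify them.

```python
import math


def fold_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel batch-norm parameters into conv weights and bias.

    weights: one flat list of weights per output channel.
    bias, gamma, beta, mean, var: one float per output channel.
    Assumes BN(y) = gamma * (y - mean) / sqrt(var + eps) + beta.
    """
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, mu, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        folded_w.append([w * scale for w in w_c])
        folded_b.append((b_c - mu) * scale + bt)
    return folded_w, folded_b


def to_fixed16(x, frac_bits=8):
    """Quantize a float to a signed 16-bit fixed-point integer.

    frac_bits is an assumed format choice; values outside the
    representable range saturate instead of wrapping around.
    """
    q = int(round(x * (1 << frac_bits)))
    return max(-32768, min(32767, q))


def dot(w, x):
    """Stand-in for one convolution output: a weighted sum over a patch."""
    return sum(wi * xi for wi, xi in zip(w, x))


if __name__ == "__main__":
    w, b = [[1.0, 2.0]], [0.5]
    gamma, beta, mean, var = [2.0], [0.1], [1.0], [0.25]
    fw, fb = fold_batch_norm(w, b, gamma, beta, mean, var, eps=0.0)
    patch = [3.0, -1.0]
    # Convolution followed by batch norm, computed separately.
    y = dot(w[0], patch) + b[0]
    reference = gamma[0] * (y - mean[0]) / math.sqrt(var[0]) + beta[0]
    # Single convolution with the folded parameters.
    folded = dot(fw[0], patch) + fb[0]
    print(reference, folded)
    print([to_fixed16(v) for v in fw[0]], to_fixed16(fb[0]))
```

Because the folding is done once, offline, the hardware only has to evaluate a single multiply-accumulate pass per layer, so no separate normalization parameters need to be stored or fetched at inference time.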