A Framework for Modeling, Optimizing, and Implementing DNNs on FPGA Using HLS

2020 
Deep Neural Networks (DNNs) are gaining importance for implementing large inference engines. A designer must weigh numerous design choices for a DNN implementation, including the data-flow type, processing elements, memory hierarchy, and data precision. A collaborative algorithm/model/hardware tool is needed to enable reconfigurable, fast, and efficient DNN hardware accelerators. We propose an accelerator framework that automatically generates an optimized FPGA-based implementation from DNNs described in standard machine learning frameworks, without a human in the loop. For fast, accurate, and efficient hardware implementation, the framework employs a coarse-grained software model that estimates performance and hardware utilization through mathematical relations. The result is High-Level Synthesis (HLS) code together with a set of optimization pragmas for fine-tuning the hardware generated in the previous phase. Different hardware-accelerator architectures can be selected based on user preferences, such as target performance, the FPGA chip, and the neural network size. Hardware implementations of several DNN models demonstrate the proposed framework's flexibility and performance. The abstract mentions HLS code annotated with optimization pragmas; a minimal sketch of what such emitted code could look like is shown below.
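The following HLS C++ fragment is illustrative only and is not taken from the paper: a small fully-connected layer kernel annotated with standard Vivado/Vitis HLS pragmas. The kernel name, array sizes, and pragma factors are hypothetical stand-ins for the per-layer parameters that the framework's performance model would be expected to tune.

```cpp
// Hypothetical example of HLS code a DNN-to-FPGA framework might emit.
// N, M, the unroll/partition factor, and the function name are assumptions.
#define N 64   // output neurons
#define M 64   // input features

void fc_layer(const float in[M], const float w[N][M], float out[N]) {
    // Split the weight array across memory banks so several multiply-
    // accumulates can read weights in the same cycle.
#pragma HLS ARRAY_PARTITION variable=w cyclic factor=8 dim=2
row:
    for (int i = 0; i < N; i++) {
        float acc = 0.0f;
    col:
        for (int j = 0; j < M; j++) {
            // Pipeline the inner MAC loop; a fine-tuning phase would adjust
            // the initiation interval and partition factor per layer.
#pragma HLS PIPELINE II=1
            acc += w[i][j] * in[j];
        }
        out[i] = acc;
    }
}
```

In a framework of this kind, the pragma parameters (partition factor, initiation interval, unroll factor) are the natural knobs for trading FPGA resource utilization against throughput, which is consistent with the abstract's description of pragma-based fine-tuning of the generated hardware.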