Automatic Generation of FPGA Kernels From Open Format CNN Models

2020 
The continuing exponential growth of deep learning applications such as image classification and object detection demands ever faster processing while keeping development time short. In particular, there is broad interest in unifying machine learning models into a universal ecosystem so that developers can benefit from framework interoperability and seamless device-specific acceleration. This is more challenging for FPGAs, which are promising platforms but need extra effort to become part of this ecosystem. This work builds on HLS4ML, an open-source project at an early stage of development that was originally created for particle physics applications and automatically translates neural networks for embedded Xilinx FPGAs. Our proposed solution adds a generalized optimization scheme on top of HLS4ML that automatically converts AI models in the open ONNX format for cloud FPGAs. In a demonstrated inference task, our design achieved a $102\times$ improvement over a single-core CPU and $6.6\times$ over a GPU, with a good tradeoff against accuracy.