Boosting Binary Neural Networks for FPGA

2020 
In this work, we propose an efficient method for executing neural networks on edge devices using an FPGA. When a neural network runs on an edge device, quantization is typically applied to compress the model, which reduces memory usage and shortens execution time; however, quantization also reduces the model's expressive power, which can degrade accuracy. We therefore propose a method for partitioning the network so that the quantized network can be executed on an FPGA as fast as possible without loss of accuracy.
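The abstract does not specify the quantization scheme, but binary neural networks commonly binarize weights with a sign function plus a per-tensor scaling factor (as in XNOR-Net-style methods). A minimal sketch of that idea, under that assumption (the function name and scheme here are illustrative, not the paper's actual method):

```python
import numpy as np

def binarize_weights(w):
    """Approximate a real-valued weight tensor by alpha * b, where
    b is in {-1, +1} and alpha is a per-tensor scale.

    This is a common binarization scheme (sign + mean-absolute scale);
    the paper's exact method is not given in the abstract.
    """
    alpha = float(np.mean(np.abs(w)))      # scalar scale: mean of |w|
    b = np.where(w >= 0, 1.0, -1.0)        # binarized weights in {-1, +1}
    return alpha, b

# Example: a 2x2 weight matrix is replaced by one float and four signs,
# which is what makes hardware-friendly XNOR/popcount execution possible.
w = np.array([[0.4, -0.2], [-0.7, 0.1]])
alpha, b = binarize_weights(w)
```

On an FPGA, the binarized multiply-accumulate reduces to XNOR and popcount operations, which is the main source of the memory and latency savings the abstract refers to.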