Review of prominent strategies for mapping CNNs onto embedded systems

2020 
Convolutional neural networks (CNNs) have become one of the key machine learning algorithms for content classification of digital images. However, the computational complexity of CNNs is considerably larger than that of classic algorithms, so CPU- or GPU-based platforms are generally used for CNN implementations; these platforms often fail to meet portability requirements due to resource, energy, and real-time constraints. Consequently, there is growing interest in real-time object-recognition solutions based on CNNs implemented on embedded systems, which are limited in both resources and energy consumption. This paper presents an updated review of prominent reported approaches for mapping CNNs onto embedded systems. Through a deduced taxonomy, two main solution trends for reducing the hardware CNN workload are distinguished. The first comprises algorithm-level solutions that reduce the number of multiplications and CNN coefficients; the second comprises hardware-level solutions that aim to reduce processing time, power consumption, and hardware resource usage. Two dominant hardware-level design strategies are identified: one oriented toward reducing energy consumption and resource utilization while meeting real-time requirements, and another oriented toward increasing throughput at the expense of resource utilization. Finally, two identified design strategies for CNN hardware accelerators are proposed as opportunity research areas.
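As a concrete illustration of the algorithm-level trend mentioned above (this sketch is not taken from the reviewed paper), magnitude pruning is one common way to reduce the number of multiplications and coefficients: weights below a threshold are zeroed, so the corresponding multiply-accumulate operations can be skipped on the target hardware. A minimal NumPy sketch, where `prune_weights` and the chosen sparsity level are illustrative assumptions:

```python
import numpy as np

def prune_weights(w: np.ndarray, sparsity: float = 0.5) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights (magnitude pruning)."""
    k = int(w.size * sparsity)
    # Threshold = magnitude of the k-th smallest weight; all weights
    # strictly below it are set to zero and need no multiplication.
    thresh = np.sort(np.abs(w), axis=None)[k]
    return np.where(np.abs(w) < thresh, 0.0, w)

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 8))          # stand-in for a conv-layer kernel
pruned = prune_weights(w, sparsity=0.5)

# Roughly half of the multiplications can now be skipped.
print(np.count_nonzero(pruned) / w.size)
```

In a real deployment this zero pattern would feed a sparse compute kernel or a compressed weight format; the example only shows how the coefficient count shrinks at the algorithm level.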