A streaming architecture for Convolutional Neural Networks based on layer operations chaining

2020 
Convolutional Neural Networks (CNNs) have become one of the most effective machine-learning algorithms for content classification of digital images. Because the computational complexity of CNNs is much larger than that of traditional algorithms, they are commonly accelerated on Graphics Processing Units (GPUs) or online servers. However, there is growing demand for real-time object recognition on embedded systems, which are limited in both resources and energy consumption. Recently reported works focus on minimizing the required resources through two design strategies. The first implements a single accelerator that can be adapted to the operations of the whole CNN. The second dedicates one accelerator to each convolution layer, achieving higher performance when processing multiple images. This paper proposes a new design strategy based on multiple accelerators that use a layer operation chaining scheme to compute the operations of multiple CNN layers in parallel. The proposed architecture adopts three types of parallel data processing, with the parallelism level of the convolution layers determined by cost-function-based algorithms. The proposed design strategy is demonstrated by implementing three naive CNNs on a DE2i-150 board, achieving a peak acceleration of 18.04x over state-of-the-art design methods without layer operation chaining. Furthermore, design results were obtained for a modified AlexNet CNN. According to these results, the proposed design strategy achieves a shorter processing time than reported works using the other two design strategies. In addition, competitive resource utilization is obtained for naive CNNs.
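The abstract does not reproduce the paper's concrete cost functions, but the underlying idea of assigning parallelism per convolution layer under a shared hardware budget can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: the layer shapes, the MAC-based latency cost function, the greedy allocation heuristic, and all identifiers are assumptions chosen for clarity.

```python
# Hypothetical sketch: cost-function-based parallelism assignment for chained
# convolution layers. Assumption: with layer operation chaining, throughput is
# bounded by the slowest pipeline stage, so each extra compute unit should go
# to the current bottleneck layer. None of this is taken from the paper itself.

from dataclasses import dataclass

@dataclass
class ConvLayer:
    name: str
    out_pixels: int       # output height * width * output channels
    macs_per_pixel: int   # kernel_h * kernel_w * input channels

def latency_cycles(layer: ConvLayer, units: int) -> float:
    """Illustrative cost function: total MACs divided by MAC units assigned."""
    return (layer.out_pixels * layer.macs_per_pixel) / units

def assign_parallelism(layers: list[ConvLayer], total_units: int) -> dict[str, int]:
    """Greedily distribute MAC units so the slowest chained stage improves most."""
    alloc = {layer.name: 1 for layer in layers}  # every stage needs >= 1 unit
    for _ in range(total_units - len(layers)):
        # Find the bottleneck stage under the current allocation ...
        bottleneck = max(layers, key=lambda l: latency_cycles(l, alloc[l.name]))
        # ... and give it one more unit of parallelism.
        alloc[bottleneck.name] += 1
    return alloc

if __name__ == "__main__":
    # Layer shapes loosely modeled on early AlexNet layers, for illustration only.
    layers = [
        ConvLayer("conv1", out_pixels=55 * 55 * 96, macs_per_pixel=11 * 11 * 3),
        ConvLayer("conv2", out_pixels=27 * 27 * 256, macs_per_pixel=5 * 5 * 96),
    ]
    print(assign_parallelism(layers, total_units=32))
```

In this sketch the budget of 32 units stands in for whatever DSP or logic resources the target FPGA provides; the paper's actual algorithms would balance the chained stages against the measured resources of the DE2i-150 board.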