Boda-RTC: Productive generation of portable, efficient code for convolutional neural networks on mobile computing platforms

2016 
The popularity of neural networks (NNs) spans academia [1], industry [2], and popular culture [3]. In particular, convolutional neural networks (CNNs) have been applied to many image based machine learning tasks and have yielded strong results [4]. The availability of hardware/software systems for efficient training and deployment of large and/or deep CNN models is critical for the continued success of the field [5] [1]. Early systems for NN computation focused on leveraging existing dense linear algebra techniques and libraries [6] [7]. Current approaches use low-level machine specific programming [8] and/or closed-source, purpose-built vendor libraries [9]. In this work, we present an open source system that, compared to existing approaches, achieves competitive computational speed while achieving significantly greater portability. We achieve this by targeting the vendor-neutral OpenCL platform [10] using a code-generation approach. We argue that our approach allows for both: (1) the rapid development of new computational kernels for existing hardware targets, and (2) the rapid tuning of existing computational kernels for new hardware targets. Results are presented for a case study of targeting the Qualcomm Snapdragon 820 mobile computing platform [11] for CNN deployment.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    31
    References
    3
    Citations
    NaN
    KQI
    []