A Compiler for Deep Neural Network Accelerators to Generate Optimized Code for a Wide Range of Data Parameters from a Hand-crafted Computation Kernel.

Eri Ogawa, Kazuaki Ishizaki, Hiroshi Inoue, Swagath Venkataramani, Jungwook Choi, Wei Wang, Vijayalakshmi Srinivasan, Moriyoshi Ohara, Kailash Gopalakrishnan


2019


This paper presents the design and implementation of a compiler for a deep neural network accelerator that provides high performance and energy efficiency. The compiler allows deep learning frameworks, such as TensorFlow, to exploit the accelerator hardware by automatically generating data transfer code and outer loops around highly tuned, hand-crafted inner loops for a wide range of neural network parameters. In other words, our compiler significantly reduces the development effort for deep learning libraries without sacrificing their performance. We have evaluated our prototype compiler and show that it can generate code for the five most critical deep learning operators with performance comparable to that of hand-tuned code.
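The general idea of wrapping a fixed-size, hand-crafted inner loop with generated outer loops and data transfer code can be illustrated with a minimal sketch. This is not the paper's compiler or its output. The tile size `TILE`, the operator (matrix multiplication), and the helpers `kernel_matmul_tile`, `load_tile`, `store_tile`, and `tiled_matmul` are all hypothetical names chosen for illustration, and plain Python lists stand in for accelerator memory.

```python
TILE = 4  # assumed fixed tile edge supported by the hypothetical hand-crafted kernel


def kernel_matmul_tile(a_tile, b_tile, c_tile):
    """Stand-in for a hand-tuned inner loop: c_tile += a_tile @ b_tile (TILE x TILE)."""
    for i in range(TILE):
        for j in range(TILE):
            acc = c_tile[i][j]
            for k in range(TILE):
                acc += a_tile[i][k] * b_tile[k][j]
            c_tile[i][j] = acc


def load_tile(mat, row0, col0):
    """Data transfer in: copy a TILE x TILE block, zero-padding past the matrix edges."""
    rows, cols = len(mat), len(mat[0])
    return [[mat[r][c] if r < rows and c < cols else 0
             for c in range(col0, col0 + TILE)]
            for r in range(row0, row0 + TILE)]


def store_tile(mat, tile, row0, col0):
    """Data transfer out: write back only the elements that fall inside the matrix."""
    rows, cols = len(mat), len(mat[0])
    for i in range(TILE):
        for j in range(TILE):
            r, c = row0 + i, col0 + j
            if r < rows and c < cols:
                mat[r][c] = tile[i][j]


def tiled_matmul(a, b):
    """Outer loops of the kind a compiler might generate for arbitrary m, k, n."""
    m, k, n = len(a), len(a[0]), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, TILE):
        for j0 in range(0, n, TILE):
            c_tile = [[0] * TILE for _ in range(TILE)]  # local accumulator buffer
            for k0 in range(0, k, TILE):
                kernel_matmul_tile(load_tile(a, i0, k0), load_tile(b, k0, j0), c_tile)
            store_tile(c, c_tile, i0, j0)
    return c
```

In this sketch the kernel only ever sees full `TILE` x `TILE` buffers. The outer loops and the padded loads and clipped stores absorb every variation in the data parameters, which is the division of labor the abstract describes.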

Keywords:

Kernel (linear algebra)
Computation
Compiler
Parallel computing
Computer science
Artificial neural network
Artificial intelligence
Efficient energy use
Computer engineering
Deep learning
Data transmission
Exploit
