DNN Dataflow Choice Is Overrated.

Xuan Yang, Mingyu Gao, Jing Pu, Ankita Nayak, Qiaoyi Liu, Steven Bell, Jeff Setter, Kaidi Cao, Heonjae Ha, Christos Kozyrakis, Mark Horowitz

2018

Many DNN accelerators have been proposed and built using different microarchitectures and program mappings. To compare these approaches fairly, we modified the Halide compiler to produce hardware as well as CPU and GPU code, and we show that Halide's existing scheduling language is powerful enough to represent all existing dense DNN accelerators. Using this system, we show that the specific dataflow chosen for the accelerator is not critical to achieving good efficiency: many different dataflows yield similar energy efficiency with good performance. However, finding the best blocking and resource allocation is critical, and we achieve 2.6X energy savings over the Eyeriss system by reducing the size of the local register file. Adding another level to the memory hierarchy saves a further 25%. Based on these observations, we develop an optimizer that automatically finds the optimal blocking and storage hierarchy. Compared with the Eyeriss system, it achieves up to 4.2X energy improvement for Convolutional Neural Networks (CNNs), and 1.6X and 1.8X improvement for Long Short-Term Memories (LSTMs) and multi-layer perceptrons (MLPs), respectively.
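
To make the term "blocking" concrete, the following is a minimal sketch of loop blocking (tiling) on a toy 1D convolution. It is an illustration only, not the paper's Halide schedules or optimizer: the function names, the `block` parameter, and the loop order are all assumptions chosen for this example. It shows that splitting and reordering the loops changes which values are reused close together, while the computed result stays the same.

```python
def conv1d_naive(x, w):
    """Direct 1D convolution (valid mode, kernel not flipped)."""
    n_out = len(x) - len(w) + 1
    out = [0] * n_out
    for i in range(n_out):
        for k in range(len(w)):
            out[i] += x[i + k] * w[k]
    return out


def conv1d_blocked(x, w, block=4):
    """Same computation with the output loop split into tiles.

    `block` is a hypothetical tile size. In an accelerator it would be
    chosen so the data touched by one tile fits in a local buffer.
    """
    n_out = len(x) - len(w) + 1
    out = [0] * n_out
    for i0 in range(0, n_out, block):        # outer loop over tiles
        i_end = min(i0 + block, n_out)
        for k in range(len(w)):              # one weight reused across the tile
            for i in range(i0, i_end):       # inner loop within the tile
                out[i] += x[i + k] * w[k]
    return out


x = list(range(10))
w = [1, 0, -1]
# Both loop nests produce identical outputs.
assert conv1d_naive(x, w) == conv1d_blocked(x, w, block=4)
```

In this sketch, the tile size and the position of the weight loop are the kind of choices the abstract refers to as blocking. Which choice is best depends on buffer sizes and energy costs, which this toy example does not model.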
