Building an on-chip deep learning memory hierarchy brick by brick: late breaking results

2020 
Data accesses between on- and off-chip memories account for a large fraction of overall energy consumption during inference with deep learning networks. We present Boveda, a lossless on-chip memory compression technique for neural networks operating on fixed-point values. Boveda reduces the datawidth used per block of values to only as many bits as necessary; since most values are of small magnitude, Boveda drastically reduces their footprint. Boveda can be used to increase the effective on-chip capacity, to reduce off-chip traffic, or to reduce the on-chip memory capacity needed to achieve a performance/energy target. Overall, Boveda reduces total model footprint to 53% of its original size.
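To illustrate the per-block idea described above, the following is a minimal sketch of block-based bit-width reduction for signed fixed-point values. It is not the paper's actual packing format; the function names and the uint32 payload representation are assumptions made purely for illustration.

```python
import numpy as np

def compress_block(block: np.ndarray) -> tuple[int, np.ndarray]:
    """Encode a block with the fewest bits that cover its largest
    magnitude (hypothetical sketch, not Boveda's on-chip layout)."""
    # Width needed for the widest value in the block, plus a sign bit.
    max_mag = int(np.max(np.abs(block.astype(np.int64)))) if block.size else 0
    width = max(1, max_mag.bit_length() + 1)
    # Store values as unsigned offsets so each fits in `width` bits.
    offset = 1 << (width - 1)
    payload = (block.astype(np.int64) + offset).astype(np.uint32)
    return width, payload

def decompress_block(width: int, payload: np.ndarray) -> np.ndarray:
    """Recover the original signed values from the per-block width
    and the offset-encoded payload."""
    offset = 1 << (width - 1)
    return payload.astype(np.int64) - offset

# Example: a block of small-magnitude values originally stored as int8.
block = np.array([3, -2, 0, 5, -1, 4, 2, -3], dtype=np.int8)
width, payload = compress_block(block)
assert np.array_equal(decompress_block(width, payload), block.astype(np.int64))
print(f"8-bit values repacked at {width} bits each, plus a per-block width header")
```

Because typical activation and weight blocks are dominated by small values, most blocks need far fewer bits than the nominal fixed-point width, which is where the footprint reduction comes from; the per-block width header is the compression metadata overhead.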