ProgressiveNN: Achieving Computational Scalability without Network Alteration by MSB-first Accumulative Computation

2020 
Computational scalability allows neural networks on embedded systems to deliver the desired inference performance while satisfying severe constraints on power consumption and computing resources. This paper proposes a simple but scalable inference method called progressiveNN that consists of bitwise binary (BWB) quantization, accumulative bit-serial (ABS) inference, and batch normalization (BN) retraining. ProgressiveNN requires no network modification and obtains all network parameters from a single training run. BWB quantization decomposes each parameter into a bitwise format for ABS inference. ABS inference consumes the parameters in MSB-first order, which enables progressive inference. The evaluation results show that the proposed method provides computational scalability from 12.5% to 100% for ResNet18 on CIFAR-10/100 with a single set of network parameters. They also show that BN retraining suppresses accuracy degradation at low computational cost and restores inference accuracy to 65% for 1-bit-width inference.
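The BWB quantization and MSB-first ABS inference described above can be illustrated with a minimal sketch. The bit-plane decomposition, layer shapes, and names below (`bwb_decompose`, `abs_inference`, `use_bits`, the unsigned 8-bit weight format) are illustrative assumptions rather than the paper's exact formulation; the sketch only shows how accumulating weight bit planes from the MSB down yields a usable partial result after any number of planes.

```python
# Sketch of MSB-first accumulative bit-serial (ABS) inference for one linear
# layer. Assumes unsigned integer-quantized weights with n_bits bits; the
# paper's actual BWB format and signed handling may differ.
import numpy as np

def bwb_decompose(w_q: np.ndarray, n_bits: int):
    """Split integer-quantized weights into binary bit planes, MSB first."""
    planes = []
    for b in range(n_bits - 1, -1, -1):          # MSB -> LSB
        planes.append(((w_q >> b) & 1).astype(np.float32))
    return planes

def abs_inference(x: np.ndarray, w_q: np.ndarray, n_bits: int, use_bits: int):
    """Accumulate partial results bit plane by bit plane.

    Stopping after `use_bits` planes trades accuracy for computation,
    which is the source of the method's computational scalability.
    """
    planes = bwb_decompose(w_q, n_bits)
    acc = np.zeros((x.shape[0], w_q.shape[1]), dtype=np.float32)
    for i in range(use_bits):                    # early exit = less compute
        bit_weight = 2.0 ** (n_bits - 1 - i)     # MSB plane carries the largest weight
        acc += bit_weight * (x @ planes[i])
    return acc

# Example: 8-bit weights, but only the top 2 of 8 bit planes (25%) evaluated.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16)).astype(np.float32)
w_q = rng.integers(0, 256, size=(16, 10), dtype=np.uint8)
partial = abs_inference(x, w_q, n_bits=8, use_bits=2)
full = abs_inference(x, w_q, n_bits=8, use_bits=8)
```

Under this 8-bit assumption, stopping after a single plane would presumably correspond to the 12.5% computation point in the abstract, while running all planes recovers the full integer product; the abstract's BN retraining step, not shown here, is what restores accuracy at the lowest bit widths.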