ProgressiveNN: Achieving Computational Scalability without Network Alteration by MSB-first Accumulative Computation
2020
Computational scalability allows neural networks on embedded systems to deliver useful inference performance while satisfying strict constraints on power consumption and computing resources. This paper proposes a simple but scalable inference method called ProgressiveNN, which consists of bitwise binary (BWB) quantization, accumulative bit-serial (ABS) inference, and batch normalization (BN) retraining. ProgressiveNN requires no network modification and obtains all network parameters from a single training run. BWB quantization decomposes each parameter into a bitwise format for ABS inference. ABS inference then consumes the parameters in MSB-first order, which enables progressive inference. Evaluation shows that the proposed method provides computational scalability from 12.5% to 100% for ResNet18 on CIFAR-10/100 with a single set of network parameters. It also shows that BN retraining suppresses accuracy degradation at low computational cost and restores inference accuracy to 65% for 1-bit inference.
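The core idea of BWB decomposition and MSB-first accumulative inference can be illustrated with a small sketch. This is not the authors' implementation; it is a minimal illustration, assuming a fixed-point quantization of the weights into bit planes and a plain dot product as the workload. The function names (`bwb_decompose`, `abs_inference`) and the 8-bit width are illustrative choices, not taken from the paper.

```python
import numpy as np

def bwb_decompose(w, n_bits=8):
    """Hypothetical sketch of BWB quantization: quantize weights to
    signed fixed-point and split the magnitude into bit planes,
    ordered MSB first so the most significant contributions come first."""
    scale = np.max(np.abs(w))
    q = np.round(np.abs(w) / scale * (2**n_bits - 1)).astype(np.int64)
    sign = np.sign(w)
    planes = []
    for k in range(n_bits - 1, -1, -1):  # MSB first
        bit = (q >> k) & 1
        planes.append(sign * bit * (2**k) / (2**n_bits - 1) * scale)
    return planes

def abs_inference(x, planes, use_bits):
    """Accumulative bit-serial inference: add one bit plane's
    contribution at a time. Stopping after `use_bits` planes trades
    accuracy for compute, which is the source of scalability."""
    acc = 0.0
    for p in planes[:use_bits]:
        acc = acc + x @ p
    return acc
```

Because the planes are consumed MSB first, each additional bit refines the previous partial result rather than recomputing it, so an early stop still yields a usable (lower-precision) output.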