ProgressiveNN: Achieving Computational Scalability without Network Alteration by MSB-first Accumulative Computation

2020 
Computational scalability allows neural networks on embedded systems to deliver the desired inference performance while satisfying severe constraints on power consumption and computing resources. This paper proposes a simple but scalable inference method called progressiveNN that consists of bitwise binary (BWB) quantization, accumulative bit-serial (ABS) inference, and batch normalization (BN) retraining. ProgressiveNN requires no network modification and obtains all network parameters from a single training run. BWB quantization decomposes each parameter into a bitwise format for ABS inference. ABS inference consumes the parameters in MSB-first order, which enables progressive inference. The evaluation results show that the proposed method provides computational scalability from 12.5% to 100% for ResNet18 on CIFAR-10/100 with a single set of network parameters. They also show that BN retraining suppresses accuracy degradation at low computational cost and restores inference accuracy to 65% for 1-bit-width inference.
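The BWB quantization and MSB-first ABS inference described above can be illustrated with a minimal sketch. The bit-plane decomposition, layer shapes, and names below (`bwb_decompose`, `abs_inference`, `use_bits`, the unsigned 8-bit weight format) are illustrative assumptions rather than the paper's exact formulation; the sketch only shows how accumulating weight bit planes from the MSB down yields a usable partial result after any number of planes.

```python
# Sketch of MSB-first accumulative bit-serial (ABS) inference for one linear
# layer. Assumes unsigned integer-quantized weights with n_bits bits; the
# paper's actual BWB format and signed handling may differ.
import numpy as np

def bwb_decompose(w_q: np.ndarray, n_bits: int):
    """Split integer-quantized weights into binary bit planes, MSB first."""
    planes = []
    for b in range(n_bits - 1, -1, -1):          # MSB -> LSB
        planes.append(((w_q >> b) & 1).astype(np.float32))
    return planes

def abs_inference(x: np.ndarray, w_q: np.ndarray, n_bits: int, use_bits: int):
    """Accumulate partial results bit plane by bit plane.

    Stopping after `use_bits` planes trades accuracy for computation,
    which is the source of the method's computational scalability.
    """
    planes = bwb_decompose(w_q, n_bits)
    acc = np.zeros((x.shape[0], w_q.shape[1]), dtype=np.float32)
    for i in range(use_bits):                    # early exit = less compute
        bit_weight = 2.0 ** (n_bits - 1 - i)     # MSB plane carries the largest weight
        acc += bit_weight * (x @ planes[i])
    return acc

# Example: 8-bit weights, but only the top 2 of 8 bit planes (25%) evaluated.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16)).astype(np.float32)
w_q = rng.integers(0, 256, size=(16, 10), dtype=np.uint8)
partial = abs_inference(x, w_q, n_bits=8, use_bits=2)
full = abs_inference(x, w_q, n_bits=8, use_bits=8)
```

Under this 8-bit assumption, stopping after a single plane would presumably correspond to the 12.5% computation point in the abstract, while running all planes recovers the full integer product; the abstract's BN retraining step, not shown here, is what restores accuracy at the lowest bit widths.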