Uniform-Precision Neural Network Quantization via Neural Channel Expansion

2021 
Uniform-precision neural network quantization has gained popularity because its simple arithmetic units can be densely packed for high computing capability. However, it ignores the heterogeneous quantization sensitivity of individual layers, resulting in sub-optimal inference accuracy. This work proposes a novel approach that adjusts the network structure to alleviate the impact of uniform-precision quantization. The proposed neural architecture search selectively expands channels for quantization-sensitive layers while satisfying hardware constraints (e.g., FLOPs). We provide substantial insights and empirical evidence that the proposed search method, called neural channel expansion, can adapt the channels of several popular networks to achieve superior 2-bit quantization accuracy on CIFAR10 and ImageNet. In particular, we demonstrate the best-to-date Top-1/Top-5 accuracy for 2-bit ResNet50 with smaller FLOPs and parameter size.
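The abstract combines two ideas: uniform-precision quantization, where every layer shares the same bit-width, and a search that widens quantization-sensitive layers under a FLOPs constraint. The sketch below is only an illustration of those two ideas under simple assumptions, not the authors' implementation; the function names (quantize_uniform, expand_channels), the greedy loop, and the toy layer dictionaries are hypothetical, whereas the paper itself uses a differentiable neural architecture search.

```python
# Minimal sketch (assumed, not the paper's code) of uniform-precision quantization
# and FLOPs-constrained channel expansion.
import numpy as np

def quantize_uniform(x, n_bits=2):
    """Symmetric uniform quantization: one scale per tensor, same bit-width for all layers."""
    q_max = 2 ** (n_bits - 1) - 1                      # e.g. 1 for signed 2-bit values
    max_abs = np.max(np.abs(x))
    scale = max_abs / q_max if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -q_max, q_max)    # quantize to integer grid
    return q * scale                                   # de-quantized ("fake-quantized") tensor

def expand_channels(layers, flops_budget, step=1.25):
    """Greedily widen the most quantization-sensitive layers while the FLOPs budget holds.

    `layers`: list of dicts with 'channels', 'flops_per_channel', and a pre-computed
    'sensitivity' score (e.g. accuracy drop when only that layer is quantized).
    This greedy loop only illustrates constraint-aware expansion; the paper's search
    is a differentiable NAS over channel choices.
    """
    total = lambda: sum(l['channels'] * l['flops_per_channel'] for l in layers)
    for layer in sorted(layers, key=lambda l: l['sensitivity'], reverse=True):
        wider = int(round(layer['channels'] * step))
        extra = (wider - layer['channels']) * layer['flops_per_channel']
        if total() + extra <= flops_budget:
            layer['channels'] = wider                  # expand only if the budget still fits
    return layers

if __name__ == "__main__":
    w = np.random.randn(4, 4).astype(np.float32)
    print("2-bit fake-quantized weights:\n", quantize_uniform(w, n_bits=2))

    layers = [
        {"name": "conv1", "channels": 64,  "flops_per_channel": 1e6, "sensitivity": 0.9},
        {"name": "conv2", "channels": 128, "flops_per_channel": 5e5, "sensitivity": 0.2},
    ]
    print(expand_channels(layers, flops_budget=1.6e8))
```

In this toy run, the more sensitive layer (conv1) is widened first, and further expansion stops as soon as the FLOPs budget would be exceeded, mirroring the constraint-aware selectivity described in the abstract.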