Layer-by-layer Quantization Method for Neural Network Parameters
2019
Limited by the storage and computing power of mobile devices, the deployment of neural networks on them remains slow. Quantizing the parameters of a neural network not only reduces the storage the network requires but also simplifies the design of its arithmetic units, which facilitates deploying neural networks on mobile devices. This paper proposes a novel parameter quantization method that quantizes the weights and output activations of the network layer by layer, compressing the model while striking a good balance between model size and accuracy. On the MNIST dataset, the method achieves 7.62x model compression with an accuracy loss of only 0.13%.
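The abstract does not specify the quantization scheme, but the layer-by-layer idea can be sketched with a simple uniform quantizer: each layer's weights get their own scale factor, are rounded to low-bit integers, and are dequantized at inference time. The function names, bit width, and per-layer max-abs scaling below are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def quantize_layer(weights, num_bits=8):
    """Uniform quantization of one layer's weights (hypothetical sketch).

    Maps floats to signed integers using a single per-layer scale,
    so each layer is compressed independently of the others.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    scale = np.max(np.abs(weights)) / qmax  # one scale per layer
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return q.astype(np.float32) * scale

# Quantize each layer independently (layer by layer).
rng = np.random.default_rng(0)
layers = [rng.standard_normal((4, 4)) for _ in range(3)]
compressed = [quantize_layer(w) for w in layers]
restored = [dequantize(q, s) for q, s in compressed]

# Worst-case reconstruction error across all layers; for uniform
# quantization it is bounded by half the per-layer scale.
max_err = max(np.max(np.abs(w - r)) for w, r in zip(layers, restored))
```

Storing int8 codes plus one float scale per layer in place of float32 weights is what yields the roughly 4-8x compression range the abstract's 7.62x figure falls into, with the bit width controlling the size/accuracy trade-off.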