A frequency-domain convolutional neural network architecture based on the frequency-domain randomized offset rectified linear unit and frequency-domain chunk max pooling method

2020 
It is of great importance to construct a convolutional neural network architecture in the frequency domain in order to explore the theory of deep learning in the frequency domain. However, because the forward and backward pipelines needed to train a convolutional neural network in the frequency domain are complex to construct, higher demands are placed on the representation strategies of the frequency-domain activation function and pooling method in those pipelines. Therefore, to construct a full frequency-domain convolutional neural network architecture, it is necessary to devise a frequency-domain representation strategy with high classification accuracy and excellent time performance. In this paper, based on a chunk decomposition mechanism and the construction principle of the frequency-domain unsaturated activation function, a frequency-domain convolutional neural network architecture is proposed. Two important representation strategies are introduced into the frequency-domain forward/backward pipeline: a frequency-domain randomized offset rectified linear unit and a frequency-domain chunk max pooling method. The former alleviates the vanishing and exploding gradient phenomena in the frequency-domain forward/backward pipeline and ensures the convergence of the convolutional neural network architecture during frequency-domain training; the latter captures the partial location information and characteristic strength of the frequency-domain neurons and improves the classification performance of the convolutional neural network in the frequency domain. This full frequency-domain convolutional neural network architecture improves the training accuracy of the convolutional neural network in the frequency-domain pipeline. The results show that with ResNet-50 as the backbone framework, an NVIDIA GeForce GPU with CUDA (Compute Unified Device Architecture) as the training pipeline, and $4\times 4$ as the activation block size of the third-level output neuron’s characteristic parameter matrix, the proposed convolutional neural network architecture lowers the top-1 error from 24.90% to 17.95% and the top-5 error from 12.85% to 9.23%. Furthermore, when the batch size is 128 (the worst-case bandwidth usage scenario), the acceleration ratio of the proposed architecture still reaches 13.0375 with cuDNN selected as the reference model. Under the same backbone framework, the proposed architecture is tested on the MetData-1 dataset, where the classification accuracy reaches its maximum value; that is, the average difference is merely 0.18. These findings show that the proposed architecture can improve the accuracy of the deep learning-based frequency-domain convolutional neural network model without reducing time performance, and that it expands the frequency-domain representation strategies for the activation function and pooling method.
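The abstract does not give the exact formulations of the two representation strategies, so the following is a minimal NumPy sketch under stated assumptions: the randomized offset rectified linear unit is assumed to rectify the magnitude of each complex frequency-domain coefficient after subtracting a small random offset (phase preserved, and the expected offset used at inference), and chunk max pooling is assumed to keep, for each non-overlapping $4\times 4$ block (the abstract's best-performing block size), the coefficient of largest magnitude together with its in-block index, so that characteristic strength and partial location information are both retained. The names `freq_randomized_offset_relu`, `freq_chunk_max_pool`, and the `offset_scale` parameter are illustrative, not the paper's.

```python
import numpy as np


def freq_randomized_offset_relu(F, offset_scale=0.05, training=True, rng=None):
    """Sketch of a frequency-domain randomized-offset ReLU (assumed form).

    F is a complex frequency-domain feature map (e.g. the output of an
    FFT-based convolution). A small random offset is subtracted from the
    magnitude of each coefficient before rectification; the phase is kept.
    At inference, the expected offset is used instead of a random draw.
    """
    rng = np.random.default_rng() if rng is None else rng
    mag, phase = np.abs(F), np.angle(F)
    offset = (rng.uniform(0.0, offset_scale, size=mag.shape)
              if training else 0.5 * offset_scale)
    return np.maximum(mag - offset, 0.0) * np.exp(1j * phase)


def freq_chunk_max_pool(F, chunk=4):
    """Sketch of frequency-domain chunk max pooling (assumed form).

    The H x W frequency plane is split into non-overlapping chunk x chunk
    blocks (4 x 4 in the abstract's best-performing setting). For each block
    the coefficient with the largest magnitude is kept, together with its
    in-block index, so that characteristic strength and partial location
    information are both retained.
    """
    H, W = F.shape[-2:]
    Hc, Wc = H // chunk, W // chunk
    blocks = F[..., :Hc * chunk, :Wc * chunk].reshape(
        *F.shape[:-2], Hc, chunk, Wc, chunk)
    blocks = blocks.swapaxes(-3, -2).reshape(*F.shape[:-2], Hc, Wc, chunk * chunk)
    idx = np.abs(blocks).argmax(axis=-1)               # in-block position
    pooled = np.take_along_axis(blocks, idx[..., None], axis=-1)[..., 0]
    return pooled, idx


if __name__ == "__main__":
    # Toy usage: FFT of a random 3-channel 32x32 map, activation, then pooling.
    x = np.random.randn(3, 32, 32)
    F = np.fft.fft2(x)
    A = freq_randomized_offset_relu(F)
    pooled, idx = freq_chunk_max_pool(A, chunk=4)
    print(pooled.shape, idx.shape)  # (3, 8, 8) (3, 8, 8)
```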