Self-attentional Convolution for Neural Networks

2019 
Convolutional neural networks (CNNs) have proven to be effective models for tackling a variety of visual tasks. In each convolutional layer, a set of filters is learned to capture local spatial connectivity patterns in the input, but as the complexity of the structure grows, network training becomes harder and harder; one of the reasons is over-fitting. Many methods have recently been proposed to reduce this factor through network regularization or normalization, but few of them take structural control into consideration. In this paper, we propose a new method, called Self-attentional Convolution, for learning and regularizing the structure of convolutional layers using ideas from attention models, which is more efficient than the original convolutional layers. To keep the formulation tractable and interpretable, we divide the attention weight factors into two parts, the channels and the shape of the kernels, serving as structural constraints on each convolutional layer from different views, and multiply them to obtain the global attentional factors for the weights of the convolutional kernel. Finally, several experiments are designed; the improvement is about 1% for 20-, 32-, 44-, 56-, and 110-layer ResNets on CIFAR-10, and 0.8% for AlexNet on ImageNet, which demonstrates the effectiveness of our method.
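The abstract only outlines the mechanism (two attention factors, over channels and over kernel shape, multiplied into a global factor that scales the kernel weights); the paper's actual implementation is not given here. The following is a minimal PyTorch-style sketch of one plausible reading of that description; the class name, parameterization, and use of a sigmoid squashing are assumptions, not the authors' code.

```python
# Sketch: scale a convolution's kernel weights by two learned attention
# factors, one over channels and one over the kernel's spatial shape,
# multiplied together into a single global attentional factor.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfAttentionalConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0):
        super().__init__()
        self.weight = nn.Parameter(
            torch.empty(out_channels, in_channels, kernel_size, kernel_size))
        nn.init.kaiming_normal_(self.weight)
        self.bias = nn.Parameter(torch.zeros(out_channels))
        # Channel attention: one factor per (out_channel, in_channel) pair.
        self.channel_attn = nn.Parameter(torch.zeros(out_channels, in_channels, 1, 1))
        # Shape attention: one factor per spatial position of the kernel.
        self.shape_attn = nn.Parameter(torch.zeros(1, 1, kernel_size, kernel_size))
        self.stride = stride
        self.padding = padding

    def forward(self, x):
        # Squash each factor into (0, 1) and combine them multiplicatively
        # to form the global attentional factor for the kernel weights.
        attn = torch.sigmoid(self.channel_attn) * torch.sigmoid(self.shape_attn)
        weight = self.weight * attn
        return F.conv2d(x, weight, self.bias,
                        stride=self.stride, padding=self.padding)


# Usage: a drop-in replacement for nn.Conv2d, e.g. inside a ResNet block.
layer = SelfAttentionalConv2d(16, 32, kernel_size=3, padding=1)
out = layer(torch.randn(8, 16, 32, 32))
print(out.shape)  # torch.Size([8, 32, 32, 32])
```

Under this reading, the two factors act as structural constraints from complementary views: the channel factor can suppress whole input-output channel connections, while the shape factor can down-weight particular spatial positions of the kernel.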