Rethinking the Effectiveness of Selective Attention in Neural Networks

2021 
The introduction of the attention mechanism is considered an important innovation in the recent development of neural networks. Specifically, in convolutional neural networks (CNNs), selective attention is applied in a channel-wise manner to dynamically recalibrate the output feature map according to the input. However, the extra attention and multi-branch computations introduce additional computational cost and reduce model parallel efficiency. We therefore rethink the effectiveness of selective attention in the network and find that the bypass-branch computation is redundant and unnecessary. We also establish an equivalence between Squeeze-and-Excitation Networks (SENet) and Selective Kernel Networks (SKNet), two representative network architectures with the feature attention mechanism. In this paper, we develop a new network architecture variant, Elastic-SKNet, by reducing the computation in the bypass branch of SKNet. Furthermore, we use a differentiable Neural Architecture Search (NAS) method to quickly search for the reduction ratio of each layer to further improve model performance. In extensive experiments on the ImageNet and MS-COCO datasets, we empirically show that the proposed Elastic-SKNet outperforms existing state-of-the-art network architectures on image classification and object detection tasks with lower model complexity, demonstrating its application prospects in other fields.
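For readers unfamiliar with the channel-wise recalibration the abstract refers to, below is a minimal PyTorch sketch of an SE-style attention block. It is illustrative only, not the paper's code: the class name `SqueezeExcite` and the `reduction=16` bottleneck ratio are assumptions. The feature map is globally pooled per channel, passed through a small bottleneck MLP, and the resulting per-channel gates rescale the original features.

```python
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Channel-wise attention: squeeze spatial dims, then excite per-channel gates."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # squeeze: global average pooling
        self.fc = nn.Sequential(                 # excitation: bottleneck MLP
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),                        # per-channel gates in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        gates = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * gates                         # recalibrate the feature map


if __name__ == "__main__":
    feat = torch.randn(2, 64, 32, 32)            # toy feature map: (batch, channels, H, W)
    print(SqueezeExcite(64)(feat).shape)         # torch.Size([2, 64, 32, 32])
```

SKNet extends this idea with multiple kernel branches whose outputs are fused by softmax attention; the paper's Elastic-SKNet variant reduces the computation spent on the bypass branch of that design.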