Information Bottleneck Approach to Spatial Attention Learning
2021
The selective visual attention mechanism in the human visual system (HVS)
restricts the amount of information that reaches visual awareness when
perceiving natural scenes, allowing near real-time information processing with
limited computational capacity [Koch and Ullman, 1987]. This kind of selectivity acts
as an 'Information Bottleneck (IB)', which seeks a trade-off between
information compression and predictive accuracy. However, such information
constraints are rarely explored in the attention mechanism for deep neural
networks (DNNs). In this paper, we propose an IB-inspired spatial attention
module for DNN structures built for visual recognition. The module takes as
input an intermediate representation of the input image, and outputs a
variational 2D attention map that minimizes the mutual information (MI) between
the attention-modulated representation and the input, while maximizing the MI
between the attention-modulated representation and the task label. To further
restrict the information passed through the attention map, we quantize the
continuous attention scores to a set of learnable anchor values during
training. Extensive experiments show that the proposed IB-inspired spatial
attention mechanism can yield attention maps that neatly highlight the regions
of interest while suppressing backgrounds, and improve standard DNN
structures on visual recognition tasks (e.g., image classification,
fine-grained recognition, cross-domain classification). The attention maps also
help interpret the decision making of the DNNs, as verified in the
experiments. Our code is available at https://github.com/ashleylqx/AIB.git.
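
The quantization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `quantize_attention` and `modulate`, the nearest-anchor rule, and the fixed anchor values are assumptions made for illustration. In the paper the anchors are learnable, and how gradients pass through the quantization is not stated in the abstract, so it is not sketched here.

```python
from typing import List, Sequence

Map2D = List[List[float]]


def quantize_attention(scores: Map2D, anchors: Sequence[float]) -> Map2D:
    """Snap each continuous attention score to its nearest anchor value.

    Assumption: nearest-anchor assignment with fixed anchors; the paper
    learns the anchor values during training.
    """
    return [[min(anchors, key=lambda a: abs(a - s)) for s in row]
            for row in scores]


def modulate(features: List[Map2D], attention: Map2D) -> List[Map2D]:
    """Scale every channel of a C x H x W feature map by an H x W attention map."""
    return [[[f * a for f, a in zip(f_row, a_row)]
             for f_row, a_row in zip(channel, attention)]
            for channel in features]


# Hypothetical 2 x 2 attention scores and three illustrative anchors.
scores = [[0.1, 0.8], [0.45, 0.6]]
anchors = [0.0, 0.5, 1.0]
quantized = quantize_attention(scores, anchors)

# A single-channel 2 x 2 feature map modulated by the quantized attention.
features = [[[2.0, 4.0], [6.0, 8.0]]]
modulated = modulate(features, quantized)
```

Restricting the map to a few anchor values limits how many distinct attention levels it can express, which is one way to read the abstract's claim that quantization further restricts the information the map lets through.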