Reparameterized attention for convolutional neural networks

2022 
The attention mechanism has been widely explored for neural networks because it can effectively model interdependencies among channels, spatial positions, and frames. The parameters of an attention module carry inherent uncertainty, yet deterministic training hardly captures it. Modeling the parameter uncertainty of the attention module allows it to capture representative patterns more flexibly, thereby improving the generalization of the model. In this work, we propose a novel reparameterized attention strategy that models the uncertainty of the parameters in the attention module and performs uncertainty-aware optimization. Instead of learning deterministic parameters for the attention modules, our strategy learns variational posterior distributions over them. Experimental results show that our strategy consistently improves the accuracy of different models and reduces the generalization gap without extra computation.
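The abstract does not spell out the module architecture, posterior family, or prior, so the following is a minimal PyTorch sketch of the general idea under assumptions: a squeeze-and-excitation style channel attention module whose fully connected weights carry a diagonal Gaussian posterior, sampled with the standard reparameterization trick during training and replaced by the posterior mean at inference (consistent with the "without extra computation" claim). The class name `ReparamChannelAttention`, the `kl_loss` helper, the standard-normal prior, and all hyperparameters are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReparamChannelAttention(nn.Module):
    """Channel attention (squeeze-and-excitation style) whose excitation
    weights follow a learned Gaussian posterior instead of being deterministic.
    Training draws a weight sample via the reparameterization trick; inference
    uses the posterior mean, so no extra compute is added at test time."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = max(channels // reduction, 1)
        # Variational posterior q(w) = N(mu, sigma^2) for each FC layer;
        # sigma is parameterized through rho via softplus to stay positive.
        self.mu1 = nn.Parameter(torch.empty(hidden, channels))
        self.rho1 = nn.Parameter(torch.full((hidden, channels), -5.0))
        self.mu2 = nn.Parameter(torch.empty(channels, hidden))
        self.rho2 = nn.Parameter(torch.full((channels, hidden), -5.0))
        nn.init.kaiming_uniform_(self.mu1)
        nn.init.kaiming_uniform_(self.mu2)

    def _sample(self, mu: torch.Tensor, rho: torch.Tensor) -> torch.Tensor:
        # Reparameterization trick: w = mu + sigma * eps with eps ~ N(0, I),
        # so gradients flow through mu and rho despite the sampling step.
        if self.training:
            sigma = F.softplus(rho)
            return mu + sigma * torch.randn_like(mu)
        return mu  # deterministic posterior mean at inference

    def kl_loss(self) -> torch.Tensor:
        # KL(q(w) || N(0, I)) summed over both layers; assumed to be added
        # to the task loss for uncertainty-aware optimization.
        kl = torch.zeros((), device=self.mu1.device)
        for mu, rho in ((self.mu1, self.rho1), (self.mu2, self.rho2)):
            sigma = F.softplus(rho)
            kl = kl + 0.5 * (sigma.pow(2) + mu.pow(2) - 1 - 2 * sigma.log()).sum()
        return kl

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (N, C, H, W)
        s = x.mean(dim=(2, 3))                            # squeeze: global average pool
        w1 = self._sample(self.mu1, self.rho1)
        w2 = self._sample(self.mu2, self.rho2)
        a = torch.sigmoid(F.linear(F.relu(F.linear(s, w1)), w2))
        return x * a.unsqueeze(-1).unsqueeze(-1)          # excitation: rescale channels
```

In such a setup, the training objective would be the usual task loss plus a (possibly weighted) sum of `kl_loss()` over all attention modules, which is how variational posteriors are typically fit; the weighting scheme used in the paper is not stated in the abstract.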