Self-attention Network with Joint Loss for Remote Sensing Image Scene Classification

2020 
Deep learning, especially the convolutional neural network (CNN), has been widely applied to remote sensing scene classification in recent years. However, such approaches tend to extract features from the whole image rather than from its discriminative regions. This article proposes a self-attention network model with a joint loss that effectively reduces the interference of complex backgrounds and the impact of intra-class diversity in remote sensing image scene classification. In this model, a self-attention mechanism is integrated into ResNet18 to extract discriminative image features, making the CNN focus on the most salient region of each image and suppressing interference from irrelevant information. Moreover, to reduce the influence of intra-class diversity on scene classification, a joint loss function combining center loss with cross-entropy loss is proposed to further improve the accuracy of complex scene classification. Experiments carried out on the AID, NWPU-RESISC45, and UC Merced datasets show that the overall accuracy of the proposed model is higher than that of most current competitive remote sensing image scene classification methods. It also performs well with fewer training samples or complex backgrounds.
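The joint loss described above can be sketched numerically. This is a minimal illustration, not the authors' implementation: the weighting factor `lam` and the exact form of the center-loss term (here the standard half-squared distance to per-class feature centers, averaged over the batch) are assumptions based on the common formulation of center loss.

```python
import numpy as np

def cross_entropy_loss(logits, labels):
    # Numerically stable softmax cross-entropy, averaged over the batch.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def center_loss(features, labels, centers):
    # Half the squared distance between each feature vector and the
    # center of its class, averaged over the batch; pulls same-class
    # features together to counter intra-class diversity.
    diffs = features - centers[labels]
    return 0.5 * (diffs ** 2).sum(axis=1).mean()

def joint_loss(logits, features, labels, centers, lam=0.01):
    # Joint objective: cross-entropy for separability plus a
    # center-loss term (weight lam is a hypothetical value).
    return cross_entropy_loss(logits, labels) + lam * center_loss(features, labels, centers)
```

When every feature coincides with its class center, the center-loss term vanishes and the joint loss reduces to plain cross-entropy; increasing `lam` trades classification margin for tighter within-class clusters.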