Global Context-Based Multilevel Feature Fusion Networks for Multilabel Remote Sensing Image Scene Classification

2021 
Unlike traditional remote sensing (RS) scene classification, which annotates an image holistically with a single scene label, multilabel RS image classification uses a set of object labels to interpret a scene in greater depth. Multilabel RS scene classification faces two key problems. First, objects with different semantic labels are smaller and more scattered than the background, which makes it difficult to extract and represent meaningful semantic features. Second, an RS scene usually contains many kinds of objects, so the size of the output label space grows exponentially with the number of object categories. To address the challenges in both the features and the label space and to improve classification performance, this article proposes a novel end-to-end deep learning architecture, which we term the global context-based multilevel feature fusion network. We validate the whole framework through extensive experiments on two publicly available multilabel datasets, and we also provide an ablation study that examines the contribution of the different modules in the framework. Experimental results demonstrate that the proposed method outperforms several popular networks for multilabel RS image scene classification.
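The exponential growth of the label space can be illustrated with a small sketch. It assumes each of C object categories is independently present or absent in a scene, so there are 2^C possible label sets. The function name `label_space_size` and the example category counts are hypothetical and are not taken from the article or its datasets.

```python
def label_space_size(num_categories: int) -> int:
    """Count the distinct label sets when each category is present or absent.

    Assumption for illustration: any combination of categories may occur,
    so the count is 2 ** num_categories.
    """
    if num_categories < 0:
        raise ValueError("num_categories must be non-negative")
    return 2 ** num_categories


if __name__ == "__main__":
    # Hypothetical category counts, chosen only to show the growth rate.
    for c in (5, 10, 20):
        print(f"{c} categories -> {label_space_size(c)} possible label sets")
```

Treating every label set as its own class quickly becomes impractical as the number of categories grows, which is the label-space difficulty the abstract refers to.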