Multiscale Convolutional Neural Networks for Geospatial Object Detection in VHR Satellite Images

2020 
Automatic object detection is a fundamental but challenging problem in the interpretation of very high-resolution (VHR) satellite images. Recently, context-based feature pyramid detection architectures have been proposed, which generate high-quality fused feature levels for multiscale object detection and significantly improve the accuracy of traditional detection frameworks. However, in feature pyramid architectures, small objects are easily lost in deep feature levels, and context cues are weakened at the same time. Moreover, each feature level in the pyramid is constructed from only a single layer of the backbone, which is not representative enough for multiscale object detection. In this letter, an end-to-end multiscale convolutional neural network (MSCNN) is proposed, built on a unified multiscale backbone named EssNet that extracts features of objects at diverse scales in VHR images. The proposed EssNet backbone maintains the resolution of deep feature levels, which improves the feature representation of multiscale objects. Meanwhile, a dilated bottleneck structure is introduced into the backbone, which generates high-quality semantic features and improves prediction for multiscale objects. Finally, the whole network is optimized end-to-end by minimizing a multitask loss. Experiments on the publicly available NWPU VHR-10 benchmark demonstrate that the proposed method outperforms several state-of-the-art detection approaches.
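The abstract mentions a dilated bottleneck structure that enlarges the receptive field while keeping the spatial resolution of deep feature levels. The paper does not spell out the exact layer layout here, so the following is only a minimal sketch of a generic ResNet-style bottleneck with a dilated 3x3 convolution (PyTorch); channel widths and the dilation rate are assumptions, not the authors' EssNet configuration.

```python
import torch
import torch.nn as nn


class DilatedBottleneck(nn.Module):
    """Illustrative dilated bottleneck block (assumed layout, not the paper's exact EssNet block)."""

    def __init__(self, channels: int, bottleneck: int = 64, dilation: int = 2):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, bottleneck, kernel_size=1, bias=False),
            nn.BatchNorm2d(bottleneck),
            nn.ReLU(inplace=True),
            # Dilated 3x3 conv: enlarges the receptive field while keeping
            # spatial resolution (stride 1, padding equal to the dilation rate).
            nn.Conv2d(bottleneck, bottleneck, kernel_size=3,
                      padding=dilation, dilation=dilation, bias=False),
            nn.BatchNorm2d(bottleneck),
            nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection: identity plus the dilated bottleneck branch,
        # so deep feature maps keep their resolution.
        return self.relu(x + self.block(x))


if __name__ == "__main__":
    feats = torch.randn(1, 256, 64, 64)   # a hypothetical deep feature map
    out = DilatedBottleneck(256)(feats)
    print(out.shape)                       # torch.Size([1, 256, 64, 64]) -- resolution preserved
```

The design point the sketch illustrates is that dilation, rather than striding or pooling, grows the receptive field of deep layers, which is why small objects are less likely to vanish from the deep feature levels used for multiscale prediction.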