Local to Global Feature Learning for Salient Object Detection

2022 
Existing works mainly focus on how to aggregate multi-level features for salient object detection, which may generate sub-optimal results due to interference from redundant details. To handle this problem, we aim to learn a local-to-global feature representation, so as to segment detailed structures from a local perspective and locate salient objects from a global perspective. In particular, we design a novel L2GF network which mainly consists of three modules, i.e., L-Net, G-Net, and F-Net. L-Net employs an enhanced auto-encoder structure to extract local contexts that provide rich object boundary information, learning detailed local features within a limited receptive field. G-Net takes tokenized feature patches as its input sequence and leverages the well-known Transformer structure to extract global contexts, which help derive the relationships among multiple salient objects and produce more complete saliency results. F-Net is a coarse-to-fine process that takes the features and maps of both the local and global branches as inputs and calculates the final high-quality saliency map. Extensive experiments on five benchmark datasets demonstrate that our L2GF network performs favorably against state-of-the-art approaches.
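To make the three-branch design concrete, below is a minimal PyTorch sketch of one plausible reading of the abstract: a convolutional auto-encoder for L-Net, a patch-tokenizing Transformer encoder for G-Net, and a fusion head for F-Net. All channel counts, patch sizes, layer depths, and the fusion scheme are illustrative assumptions, not the paper's actual L2GF architecture.

```python
# Hypothetical sketch of the L2GF three-branch layout; every hyperparameter
# here (channels, patch size, depth, fusion) is an assumption for illustration.
import torch
import torch.nn as nn


class LNet(nn.Module):
    """Local branch: a small conv auto-encoder preserving boundary detail."""
    def __init__(self, ch=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))  # full-resolution local features


class GNet(nn.Module):
    """Global branch: tokenize image patches, run a Transformer encoder."""
    def __init__(self, ch=64, patch=16, img=224, depth=4, heads=4):
        super().__init__()
        self.patch = patch
        self.n = (img // patch) ** 2
        self.embed = nn.Conv2d(3, ch, patch, stride=patch)  # patch tokenizer
        self.pos = nn.Parameter(torch.zeros(1, self.n, ch))  # learned positions
        layer = nn.TransformerEncoderLayer(d_model=ch, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):
        b = x.size(0)
        t = self.embed(x).flatten(2).transpose(1, 2) + self.pos  # (B, N, C)
        t = self.encoder(t)                                      # global context
        h = w = int(self.n ** 0.5)
        g = t.transpose(1, 2).reshape(b, -1, h, w)               # back to a map
        return nn.functional.interpolate(
            g, scale_factor=self.patch, mode="bilinear", align_corners=False)


class FNet(nn.Module):
    """Fusion branch: coarse-to-fine refinement of concatenated features."""
    def __init__(self, ch=64):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, 1),  # final saliency logits
        )

    def forward(self, local_f, global_f):
        return torch.sigmoid(self.fuse(torch.cat([local_f, global_f], dim=1)))


class L2GF(nn.Module):
    def __init__(self):
        super().__init__()
        self.lnet, self.gnet, self.fnet = LNet(), GNet(), FNet()

    def forward(self, x):
        return self.fnet(self.lnet(x), self.gnet(x))


if __name__ == "__main__":
    model = L2GF()
    saliency = model(torch.randn(1, 3, 224, 224))
    print(saliency.shape)  # torch.Size([1, 1, 224, 224])
```

The key design point the abstract emphasizes is the division of labor: the convolutional branch keeps fine spatial detail for boundaries, while the Transformer branch models long-range dependencies so that spatially separated salient objects are detected as a coherent whole.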