Flow driven attention network for video salient object detection

2020 
Salient object detection has recently been revolutionised by convolutional neural networks (CNNs). However, state-of-the-art still-image saliency detectors are hard to transfer directly to videos, because they neglect the temporal context between frames. In this study, the authors propose a flow-driven attention network (FDAN) to exploit motion information for video salient object detection. FDAN consists of an appearance feature extractor, a motion-guided attention module and a saliency map regression module, which respectively extract the appearance feature of each frame, refine that feature with optical flow and infer the final saliency map. The motion-guided attention module is the core of FDAN and extracts motion information in the form of attention. The attention mechanism is a two-branch CNN that fuses optical flow and appearance features. In addition, a shortcut connection is applied to the attention-multiplied feature map to further suppress noise. Experimental results show that the proposed method achieves performance on par with the state-of-the-art flow-guided recurrent neural encoder on the challenging Densely Annotated Video Segmentation and Freiburg–Berkeley Motion Segmentation benchmarks while being twice as fast in detection.
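The attention-plus-shortcut step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `motion_guided_attention`, the use of a sigmoid to turn motion scores into attention weights, and the assumption that the two-branch CNN has already produced a single-channel map of motion scores are all hypothetical choices made for illustration. Plain Python lists are used so that the sketch has no dependencies.

```python
import math


def sigmoid(x):
    """Squash a raw score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def motion_guided_attention(appearance, motion_scores):
    """Refine an appearance feature map with a motion attention map.

    appearance    : nested list of shape [C][H][W] (per-frame appearance feature)
    motion_scores : nested list of shape [H][W]; raw scores assumed to come
                    from the flow branch (hypothetical input, not from the paper)

    Returns appearance * attention + appearance, i.e. the attention-multiplied
    feature map with a shortcut connection back to the input feature.
    """
    refined = []
    for channel in appearance:
        refined_channel = []
        for feat_row, score_row in zip(channel, motion_scores):
            refined_channel.append(
                [f * sigmoid(s) + f for f, s in zip(feat_row, score_row)]
            )
        refined.append(refined_channel)
    return refined


# One channel, 2x2 feature map; zero scores give attention 0.5 everywhere.
features = [[[1.0, 2.0], [3.0, 4.0]]]
scores = [[0.0, 0.0], [0.0, 0.0]]
print(motion_guided_attention(features, scores))
```

Under these assumptions, the shortcut means that a location with near-zero attention still passes its appearance feature through unchanged, so an unreliable optical flow estimate can attenuate the motion contribution without erasing the feature itself.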