FBR-CNN: A Feedback Recurrent Network for Video Saliency Detection

2021 
Unlike saliency detection on static images, saliency prediction on dynamic scenes relies heavily on the context and dynamic information in video sequences. In this work, we propose a novel feedback recurrent network (FBR-CNN) that simultaneously learns rich contextual and dynamic features for video saliency detection. To capture the dynamic relationships across video frames, we incorporate recurrent convolutional layers into a standard feed-forward CNN model. With multiple video frames as input, the powerful recurrent units strengthen long-term dependence and contextual relevance over time. Unlike feed-forward-only CNN models, we feed the features learned by high-level feedback recurrent blocks (FBR-blocks) back to low-level layers to further enhance the contextual representations. Experiments on public video saliency benchmarks demonstrate that the model with feedback connections and recurrent units dramatically improves on the baseline feed-forward structure. Moreover, although the proposed model has few parameters (~6.5 MB), it achieves performance comparable to existing video saliency approaches.
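The abstract describes two mechanisms: recurrent convolutional units that carry state across frames, and feedback connections that route high-level features back to low-level layers. The paper's actual architecture is not given here, so the following is only a minimal NumPy sketch of those two ideas under assumed details (single-channel maps, naive 3x3 convolutions, a fixed number of feedback iterations, and the weight arrays `w_in`, `w_rec`, `w_high`, `w_fb`, which are all hypothetical names, not from the paper):

```python
import numpy as np

def conv3x3(x, w):
    """Naive 'same'-padded 3x3 convolution over a single-channel 2D map."""
    H, W = x.shape
    padded = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * w)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def fbr_saliency(frames, w_in, w_rec, w_high, w_fb, n_feedback=2):
    """Hypothetical sketch of a feedback recurrent block:
    - a recurrent state h integrates features across frames (temporal dynamics);
    - high-level features are fed back and fused with low-level features."""
    h = np.zeros_like(frames[0])
    for x in frames:
        low = relu(conv3x3(x, w_in))                 # low-level feed-forward features
        for _ in range(n_feedback):                  # feedback iterations
            high = relu(conv3x3(h, w_high))          # high-level features from the state
            low = relu(low + conv3x3(high, w_fb))    # feed back to the low level
            h = relu(conv3x3(low, w_rec) + h)        # recurrent update over time
    return 1.0 / (1.0 + np.exp(-h))                  # saliency map squashed into (0, 1)

# Usage: a few random frames and random 3x3 kernels stand in for learned weights.
rng = np.random.default_rng(0)
frames = [rng.standard_normal((8, 8)) for _ in range(3)]
w_in, w_rec, w_high, w_fb = (0.1 * rng.standard_normal((3, 3)) for _ in range(4))
saliency = fbr_saliency(frames, w_in, w_rec, w_high, w_fb)
```

The inner loop is the feedback path (high-level state refines low-level features), while the outer loop over frames is the recurrent path; in the real model both would operate on multi-channel feature maps with learned convolutions.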