Multi-Dimensional Residual Dense Attention Network for Stereo Matching

2019 
Very deep convolutional neural networks (CNNs) have recently achieved great success in stereo matching. However, it remains highly desirable to learn robust feature maps that improve performance in ill-posed regions such as weakly textured regions, reflective surfaces, and repetitive patterns. We therefore propose an end-to-end multi-dimensional residual dense attention network (MRDA-Net) that focuses on more comprehensive pixel-wise feature extraction. The proposed network consists of two parts: a 2D residual dense attention net for feature extraction and a 3D convolutional attention net for matching. The 2D residual dense attention net uses a dense network structure to fully exploit the hierarchical features from preceding convolutional layers and a residual structure to fuse low-level structural information with high-level semantic information; its 2D attention module adaptively recalibrates channel-wise features so that the network concentrates on informative features. The 3D convolutional attention net further extends the attention mechanism to matching: its stacked hourglass module extracts multi-scale context as well as geometry information, and its novel 3D attention module aggregates hierarchical sub-cost volumes adaptively rather than manually, yielding a comprehensively recalibrated cost volume for more accurate disparity computation. Experiments demonstrate that our approach achieves state-of-the-art accuracy on the Scene Flow, KITTI 2012, and KITTI 2015 stereo datasets.
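The abstract does not give implementation details, but the two attention mechanisms it describes can be sketched roughly as follows: a channel-wise recalibration of 2D feature maps and an adaptive weighting of hierarchical sub-cost volumes before aggregation. The class names, reduction ratio, pooling choices, and softmax-based weighting below are illustrative assumptions, not the authors' implementation.

```python
# Minimal PyTorch sketch of the two attention mechanisms described in the
# abstract. All module names and hyperparameters are assumed for illustration.
from typing import List

import torch
import torch.nn as nn


class ChannelAttention2D(nn.Module):
    """Squeeze-and-excitation style recalibration of 2D feature channels."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # global spatial context
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                            # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # recalibrated feature map


class CostVolumeAttention3D(nn.Module):
    """Adaptively weights hierarchical sub-cost volumes instead of fusing them manually."""

    def __init__(self, num_volumes: int, channels: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Sequential(
            nn.Linear(num_volumes * channels, num_volumes),
            nn.Softmax(dim=1),                       # one weight per sub-cost volume
        )

    def forward(self, volumes: List[torch.Tensor]) -> torch.Tensor:
        # volumes: list of (B, C, D, H, W) sub-cost volumes, e.g. from hourglass stages
        b = volumes[0].shape[0]
        desc = torch.cat([self.pool(v).view(b, -1) for v in volumes], dim=1)
        w = self.fc(desc)                            # (B, num_volumes)
        fused = sum(w[:, i].view(b, 1, 1, 1, 1) * v for i, v in enumerate(volumes))
        return fused                                 # recalibrated cost volume
```

As a usage sketch, `ChannelAttention2D` would wrap the output of a residual dense block in the 2D feature extractor, while `CostVolumeAttention3D` would take the sub-cost volumes produced by the stacked hourglass stages and return a single fused volume for disparity regression.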