M^3VSNet: Unsupervised Multi-metric Multi-view Stereo Network

2020 
Current deep-learning-based multi-view stereo (MVS) methods outperform traditional MVS methods. However, these learning-based networks need large amounts of ground-truth 3D training data, which are not always easy to obtain. To relieve this expensive cost, we propose an unsupervised, normal-aided, multi-metric network, named M^3VSNet, for multi-view stereo reconstruction without ground-truth 3D training data. Our network puts forward: (a) pyramid feature aggregation to extract more contextual information; (b) normal-depth consistency to make the estimated depth maps more reasonable and precise in the real 3D world; (c) a multi-metric combination of pixel-wise and feature-wise loss functions to learn the inherent constraints from the perspective of perception, beyond the pixel values. Extensive experiments show that M^3VSNet achieves state-of-the-art results on the DTU dataset with effective improvement. Without any finetuning, M^3VSNet ranked 1st among all unsupervised MVS networks on the Tanks & Temples leaderboard as of April 17, 2020. Our codebase is available at this https URL.
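As an illustration of item (c), the following is a minimal sketch of how a pixel-wise term and a feature-wise term could be combined into a single loss. It is not the authors' implementation: the function names, the L1 form of both terms, the validity mask, the weights `w_pixel` and `w_feature`, and the assumption that the feature maps share the image resolution are all hypothetical choices made for illustration.

```python
import numpy as np

def masked_l1(a, b, mask):
    """Mean per-pixel L1 distance between a and b over valid pixels.

    a, b: arrays of shape (H, W, C); mask: array of shape (H, W), 1 = valid.
    Hypothetical helper, not taken from the paper.
    """
    diff = np.abs(a - b).sum(axis=-1)      # L1 over channels, shape (H, W)
    valid = mask.sum()
    # Guard against an empty mask to avoid division by zero.
    return float((diff * mask).sum() / max(valid, 1.0))

def multi_metric_loss(ref_img, warped_img, ref_feat, warped_feat, mask,
                      w_pixel=1.0, w_feature=1.0):
    """Weighted sum of a pixel-wise and a feature-wise reconstruction term.

    warped_img / warped_feat stand for a source view (and its features)
    warped into the reference view using the estimated depth; the warping
    itself is outside this sketch. The weights are assumed values.
    """
    pixel_term = masked_l1(ref_img, warped_img, mask)
    feature_term = masked_l1(ref_feat, warped_feat, mask)
    return w_pixel * pixel_term + w_feature * feature_term

# Tiny usage example with synthetic arrays.
ref_img = np.zeros((2, 2, 3))
warped_img = np.ones((2, 2, 3))
ref_feat = np.zeros((2, 2, 4))
warped_feat = np.full((2, 2, 4), 0.5)
mask = np.ones((2, 2))
loss = multi_metric_loss(ref_img, warped_img, ref_feat, warped_feat, mask,
                         w_pixel=1.0, w_feature=0.5)
```

In this sketch the feature-wise term compares learned feature maps rather than raw intensities, which is one way to read "beyond the pixel value"; how the features are extracted and how the terms are weighted are left open here.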