Multi-Scale Spatio-Temporal Feature Extraction and Depth Estimation from Sequences by Ordinal Classification.

2020 
Depth estimation is a key problem in 3D computer vision and has a wide variety of applications. In this paper we explore whether deep learning network can predict depth map accurately by learning multi-scale spatio-temporal features from sequences and recasting the depth estimation from a regression task to an ordinal classification task. We design an encoder-decoder network with several multi-scale strategies to improve its performance and extract spatio-temporal features with ConvLSTM. The results of our experiments show that the proposed method has an improvement of almost 10% in error metrics and up to 2% in accuracy metrics. The results also tell us that extracting spatio-temporal features can dramatically improve the performance in depth estimation task. We consider to extend this work to a self-supervised manner to get rid of the dependence on large-scale labeled data.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    52
    References
    1
    Citations
    NaN
    KQI
    []