Fine-grained Recognition of 3D Shapes Based on Multi-view Recurrent Neural Network

2020 
Multi-view convolutional neural networks (MVCNN) have been shown to be more accurate and faster than methods based on 3D modelling descriptors in 3D shape recognition tasks. In this work, we improve MVCNN in three respects. First, MVCNN was trained and validated on the ModelNet dataset, in which views are rendered from 12 or 20 cameras at fixed positions. Because this setting differs substantially from real applications, we built another synthetic multi-view dataset, MV3D, which renders 100 views of each object from freely chosen camera positions. Second, a recurrent neural network (RNN) is used in place of the max-pooling operation to fuse multi-view features; we call the resulting model MVRNN. The RNN places no requirement on the number of views to fuse and does not lose appearance information. Finally, we define a loss function that takes the discriminability of the fused features into account and use it to train MVRNN. The new loss function improves both the discriminability of the fused features and the recognition accuracy. Comparisons on both the ModelNet and MV3D datasets show that MVRNN achieves higher accuracy.
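To make the contrast between the two fusion strategies concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the recurrent cell, feature dimensions, or training procedure, so a plain tanh (Elman-style) RNN with random, untrained weights is assumed, and all names (`max_pool_fuse`, `rnn_fuse`, `random_matrix`) are hypothetical. The discriminative loss is not sketched because the abstract gives no formula for it.

```python
import math
import random


def max_pool_fuse(views):
    """Element-wise max over per-view feature vectors (MVCNN-style view pooling)."""
    return [max(column) for column in zip(*views)]


def rnn_fuse(views, w_in, w_rec, bias):
    """Fuse views sequentially; the final hidden state is the fused feature.

    Assumed cell: h_t = tanh(W_in x_t + W_rec h_{t-1} + b).
    """
    hidden = [0.0] * len(bias)
    for x in views:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
                + bias[j]
            )
            for j in range(len(bias))
        ]
    return hidden


def random_matrix(rows, cols, rng, scale=0.5):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
feat_dim, hidden_dim = 4, 3  # illustrative sizes, not taken from the paper
w_in = random_matrix(hidden_dim, feat_dim, rng)
w_rec = random_matrix(hidden_dim, hidden_dim, rng)
bias = [0.0] * hidden_dim

# Two objects observed from different numbers of views.
views_a = [[rng.random() for _ in range(feat_dim)] for _ in range(5)]
views_b = [[rng.random() for _ in range(feat_dim)] for _ in range(12)]

fused_max = max_pool_fuse(views_a)
fused_rnn_a = rnn_fuse(views_a, w_in, w_rec, bias)
fused_rnn_b = rnn_fuse(views_b, w_in, w_rec, bias)
```

In this sketch, max-pooling keeps only the largest value per feature dimension and ignores view order, whereas the recurrent state is updated by every view in turn; both produce a fixed-size fused vector regardless of how many views are supplied.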