ETS-3D: An Efficient Two-Stage Framework for Stereo 3D Object Detection
2022
We propose ETS-3D, an efficient two-stage framework for stereo 3D object detection. Unlike many recent approaches that rely on depth maps predicted by time-consuming stereo matching models, our approach uses well-designed features to generate high-quality 3D proposals in stage-1 without explicitly exploiting a predicted depth map. Specifically, we leverage pixel-wise correlation to produce normalized cost volumes that weight the left image features, and fuse the multi-scale weighted features for 3D proposal generation. To maintain fast computation, only the filtered positive 3D proposals are fed into the stage-2 sub-network for further proposal refinement and quality prediction. Furthermore, we reconstruct the 3D proposal features in stage-2 to make use of different feature representations, achieving more accurate detection results. Experimental results on the KITTI 3D object detection benchmark demonstrate that our method achieves state-of-the-art performance and runs at more than 10 fps.
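The cost-volume weighting step described in the abstract can be illustrated with a minimal single-scale NumPy sketch. This is not the authors' implementation: the function names, the softmax normalization over the disparity axis, the channel-mean correlation, and the way the weighted features are stacked are all assumptions made for illustration, and the multi-scale fusion is omitted.

```python
import numpy as np


def correlation_cost_volume(left, right, max_disp):
    """Pixel-wise correlation between left and right feature maps.

    left, right: arrays of shape (C, H, W).
    Returns cost of shape (max_disp, H, W), where
    cost[d, y, x] = mean over channels of left[:, y, x] * right[:, y, x - d].
    Columns with x < d have no match in the right image and stay at zero.
    """
    _, h, w = left.shape
    cost = np.zeros((max_disp, h, w), dtype=left.dtype)
    for d in range(max_disp):
        if d == 0:
            cost[d] = (left * right).mean(axis=0)
        else:
            cost[d, :, d:] = (left[:, :, d:] * right[:, :, :-d]).mean(axis=0)
    return cost


def normalize_over_disparity(cost):
    """Softmax over the disparity axis (an assumed choice of normalization)."""
    shifted = cost - cost.max(axis=0, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)


def weight_left_features(left, right, max_disp):
    """Weight left features by the normalized cost volume.

    Returns an array of shape (max_disp * C, H, W): one copy of the left
    features per disparity hypothesis, scaled by that hypothesis's weight.
    """
    prob = normalize_over_disparity(
        correlation_cost_volume(left, right, max_disp)
    )
    # (D, 1, H, W) * (1, C, H, W) broadcasts to (D, C, H, W)
    weighted = prob[:, None] * left[None]
    return weighted.reshape(-1, *left.shape[1:])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left_feat = rng.standard_normal((4, 6, 8))
    right_feat = rng.standard_normal((4, 6, 8))
    out = weight_left_features(left_feat, right_feat, max_disp=3)
    print(out.shape)
```

Because the weights sum to one at every pixel, summing the weighted copies over the disparity axis recovers the original left features; no depth map is ever decoded, which matches the abstract's point that depth is not predicted explicitly.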