3D Human Pose Estimation: Using Context Information in Monocular Video

2021 
We propose a context-based, two-stage network for 3D human pose estimation. The first stage obtains the 2D human pose and 2D key points from the video stream; this stage is crucial to the subsequent work and to the entire pipeline. After analyzing the limitations of existing models, we propose a context-based human pose estimation network that incorporates a BiLSTM structure into the pose machine method. In our model, invisible key points are predicted jointly from the human pose in the current frame and from context information. Quantitative and visualization experiments show that this mitigates both the invisible key points caused by occlusion and the incorrect linking of human key points. In the second stage, the 3D human pose is obtained through sparse representation and 3D reconstruction. The experimental results show that our method achieves higher accuracy than existing video-based human pose estimation methods and performs better under occlusion.
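The idea of recovering an invisible key point from both earlier and later frames can be illustrated with a minimal sketch. This is not the BiLSTM pose machine described above: it swaps in plain linear interpolation between the nearest visible frames on either side, only to show how bidirectional temporal context can fill an occluded joint. The function name `fill_invisible` and the track layout (one `(x, y)` tuple per frame, `None` when the key point is not visible) are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def fill_invisible(track: List[Optional[Point]]) -> List[Optional[Point]]:
    """Fill None entries in one key point's 2D track using past and future frames.

    Hypothetical stand-in for a learned bidirectional model: linear
    interpolation between the nearest visible frames on each side.
    """
    n = len(track)

    # Forward pass: index of the latest visible frame at or before t.
    prev_idx: List[Optional[int]] = [None] * n
    last: Optional[int] = None
    for t in range(n):
        if track[t] is not None:
            last = t
        prev_idx[t] = last

    # Backward pass: index of the earliest visible frame at or after t.
    next_idx: List[Optional[int]] = [None] * n
    nxt: Optional[int] = None
    for t in range(n - 1, -1, -1):
        if track[t] is not None:
            nxt = t
        next_idx[t] = nxt

    filled: List[Optional[Point]] = []
    for t in range(n):
        p, q = prev_idx[t], next_idx[t]
        if track[t] is not None:
            # Visible in this frame: keep the detection unchanged.
            filled.append(track[t])
        elif p is not None and q is not None:
            # Occluded with context on both sides: interpolate.
            w = (t - p) / (q - p)
            (x0, y0), (x1, y1) = track[p], track[q]
            filled.append((x0 + w * (x1 - x0), y0 + w * (y1 - y0)))
        elif p is not None:
            # Only past context: hold the last visible position.
            filled.append(track[p])
        elif q is not None:
            # Only future context: use the next visible position.
            filled.append(track[q])
        else:
            # Never visible in this clip: nothing to recover from.
            filled.append(None)
    return filled
```

Calling `fill_invisible` on a track such as `[(0.0, 0.0), None, (2.0, 4.0), None]` fills the second frame from its two neighbors and the last frame from past context alone. A learned recurrent model would replace the interpolation rule with a prediction conditioned on the whole pose, not on a single joint in isolation.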