A Spatial-temporal 3D Human Pose Reconstruction Framework
Abstract:
3D human pose reconstruction from a single-view camera is a difficult and challenging topic. Many approaches have been proposed, but most process frames independently, even though consecutive frames in a pose sequence are highly correlated. In contrast, we introduce a novel spatial-temporal 3D reconstruction framework that leverages both intra- and inter-frame relationships in consecutive 2D pose sequences. The Orthogonal Matching Pursuit (OMP) algorithm, pre-trained pose-angle limits, and temporal models have been implemented. We quantitatively compare our framework against recent works on the CMU motion capture dataset and Vietnamese traditional dance sequences. Our method outperforms the others, with 10 percent lower Euclidean reconstruction error and robustness against Gaussian noise. Our reconstructed 3D pose sequences are also smoother and more natural.

A robot needs to localize an unknown object before grasping it. When the robot has only a monocular sensor, how can it obtain the object pose? In this work, we present a method for localizing the 6-DOF pose of a target object using a robotic arm and a hand-mounted monocular camera. The method includes an object recognition process and a localization process. The recognition process uses point features on a surface of the target as a model of the object. The localization process combines robotic motion data and image data to calculate the 6-DOF pose of the object. This method can process objects containing textured planes. We verify the method in real tests.
Monocular vision
Monocular
Citations (1)
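The Orthogonal Matching Pursuit step named in the abstract above can be sketched as a generic greedy sparse-coding routine over a dictionary of basis poses. This is a minimal sketch under assumed shapes (a d×n dictionary `B` whose columns are basis poses, an observed pose vector `y`, sparsity `k`), not the authors' implementation.

```python
import numpy as np

def omp(B, y, k):
    """Orthogonal Matching Pursuit: find a k-sparse coefficient vector c
    such that B @ c approximates y. B is a (d x n) dictionary whose
    columns are basis poses; y is the observed pose vector."""
    residual = y.copy()
    support = []
    c = np.zeros(B.shape[1])
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(B.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Re-fit coefficients on the selected support by least squares.
        coef, *_ = np.linalg.lstsq(B[:, support], y, rcond=None)
        residual = y - B[:, support] @ coef
    c[support] = coef
    return c
```

In a pose-reconstruction setting the recovered sparse code would then be mapped back to 3D joint positions through the pose basis; that mapping is specific to the paper and is not reproduced here.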
Abstract In this paper, an object recognition method and a pose estimation approach using stereo vision are presented. The proposed approach was used for position-based visual servoing of a 6-DoF manipulator. The object detection and recognition method was designed with the purpose of increasing robustness. An RGB color-based object descriptor and an online correction method are proposed for object detection and recognition. Pose was estimated using the depth information derived from a stereo vision camera and an SVD-based method. The transformation between the desired pose and the object pose was calculated and later used for position-based visual servoing. Experiments were carried out to verify the proposed approach for object recognition. The stereo camera was also tested to check whether its depth accuracy is adequate. The proposed object recognition method is invariant to scale, orientation, and lighting conditions, which increases the level of robustness. The accuracy of the stereo vision camera can reach 1 mm, which is adequate for tasks such as grasping and manipulation.
Robustness
Visual Servoing
RGB color model
Computer stereo vision
Citations (4)
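The SVD-based pose estimation mentioned in the abstract above is commonly realized as least-squares rigid alignment (the Kabsch procedure) between corresponding 3D point sets, e.g. model points versus points measured by a stereo depth camera. The sketch below assumes known correspondences; the function name and interface are ours, not the paper's.

```python
import numpy as np

def rigid_transform_svd(P, Q):
    """Estimate rotation R and translation t such that R @ P_i + t ~ Q_i,
    given two (n x 3) arrays of corresponding 3D points."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)          # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:           # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cq - R @ cp
    return R, t
```

The recovered (R, t) gives the object pose in the camera frame; for position-based visual servoing it would then be chained with the hand-eye transform to express the error in the robot base frame.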
In this paper we present an algorithm for detecting objects in a sequence of color images taken from a moving camera. The first step of our algorithm is the estimation of motion in the image plane. Instead of calculating optical flow or tracking single points, edges, or regions over a sequence of images, we determine the motion of clusters built by grouping pixels in a color/position feature space. The second step is a motion-based segmentation, where adjacent clusters with similar trajectories are combined to build object hypotheses. Our application area is vision-based driving assistance. The algorithm has been successfully tested on traffic scenes containing objects such as cars, motorcycles, and pedestrians.
Optical Flow
Tracking
Feature (computer vision)
Image plane
Sequence
Position
Match moving
Motion field
Citations (18)
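The first step of the algorithm above, grouping pixels in a joint color/position feature space, can be sketched with plain k-means. The spatial weight `w` and the deterministic farthest-point initialization are illustrative choices of ours, and the second, motion-based merging step is omitted.

```python
import numpy as np

def cluster_pixels(img, k=4, iters=20, w=0.2):
    """Group pixels of an RGB image (H x W x 3, floats in [0, 1]) by
    k-means in a joint color/position feature space. `w` weights the
    normalized spatial coordinates relative to color. Returns an
    (H x W) integer label map."""
    h, wd, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:wd]
    feats = np.column_stack([w * ys.ravel() / h,
                             w * xs.ravel() / wd,
                             img.reshape(-1, 3)])
    # Farthest-point initialization: deterministic and well spread.
    centers = [feats[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(feats - c, axis=1) for c in centers], axis=0)
        centers.append(feats[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(iters):
        # Assign every pixel to its nearest cluster center, then update.
        d = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = feats[labels == j].mean(axis=0)
    return labels.reshape(h, wd)
```

In the paper's pipeline, each such cluster would then be tracked across frames, and adjacent clusters with similar trajectories merged into object hypotheses.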
Stripe-laser-based stereo vision is often used in robot vision-guided systems in the eye-in-hand configuration. The 3D scene is reconstructed from the many 3D stripes obtained by the sensor, but 3D objects cannot be recognized from stripe information alone. In cluttered 3D scenes, recognition is further complicated by object pose and matching ambiguity. The video stream from the camera of the stripe-laser stereo system, however, can help recognize 3D objects. This paper proposes an object-oriented vision-guided robot approach in which video segmentation, tracking, and recognition are used to guide the robot, reducing the complexity of 3D object detection, recognition, and pose estimation. Experimental results demonstrate the effectiveness of the approach.
Structured Light
Stereo cameras
Machine Vision
Computer stereo vision
Citations (2)
Moving cameras are needed for a wide range of applications in robotics, vehicle systems, surveillance, etc. However, many foreground object segmentation methods reported in the literature are unsuitable for such settings; these methods assume that the camera is fixed and the background changes slowly, and are inadequate for segmenting objects in video if there is significant motion of the camera or background. To address this shortcoming, a new method for segmenting foreground objects is proposed that utilizes binocular video. The method is demonstrated in the application of tracking and segmenting people in video who are approximately facing the binocular camera rig. Given a stereo image pair, the system first tries to find faces. Starting at each face, the region containing the person is grown by merging regions from an over-segmented color image. The disparity map is used to guide this merging process. The system has been implemented on a consumer-grade PC, and tested on video sequences of people indoors obtained from a moving camera rig. As can be expected, the proposed method works well in situations where other foreground-background segmentation methods typically fail. We believe that this superior performance is partly due to the use of object detection to guide region merging in disparity/color foreground segmentation, and partly due to the use of disparity information available with a binocular rig, in contrast with most previous methods that assumed monocular sequences.
Monocular
Image segmentation
Binocular disparity
Tracking
Citations (2)
This article proposes a new and easy hand-eye calibration method suitable for a camera mounted on the end-effector of an industrial robot, using only a single image. The hand-eye calibration information can be used for robotic pick-up of cubes with a monocular camera. Images captured from a particular camera pose are segmented using a fusion of multiple methods, so that object information is obtained even when there is little contrast between the object and the background, or under varying lighting. The edge information, and subsequently the pose of the object, is estimated using a minimum number of images. In some cases a single image is sufficient, but when only a single edge is obtained, an additional image is grabbed after aligning the camera with the detected edge. An additional edge is estimated using a directional thresholding operation. The 3-D edge information obtained using the calibration is then used to calculate the pose of the object to facilitate robotic pick-up. To ensure safety, the estimate is verified by projecting the computed coordinates, and the final pick-up is performed while monitoring the force to avoid damage due to collisions. The proposed approaches were physically implemented and experimentally validated.
Monocular
Citations (6)
In order to improve the accuracy and efficiency of robot grasping, we propose a new method for transparent object detection and localization that utilizes depth, RGB, and IR images. In the detection process, an active depth sensor (RealSense) is first employed to retrieve transparent candidates from the depth image, and the corresponding candidates in the RGB and IR images are then extracted separately. A candidate classification algorithm is subsequently presented that uses SIFT features to recognize the transparent objects among the candidates. In the localization process, we obtain a new group of RGB and IR images by adjusting the camera orientation to make its optical axis perpendicular to the normal direction of the plane on which the object is placed. The object contours in the RGB and IR images are then extracted, respectively. The three-dimensional object is finally reconstructed by stereo matching of the two contours, and the current pose of the object is calculated. To verify the feasibility of the method, we built a hand-eye test system with a movable industrial robot to detect and grasp transparent objects at different locations. The final test results demonstrate that the method is more general and effective than the traditional one.
RGB color model
Scale-invariant feature transform
Citations (27)
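The final stereo-matching step above, recovering 3D points from a matched pair of image contours, rests on triangulation. A minimal linear (DLT) triangulator is sketched below, assuming known 3x4 projection matrices for the two views; it is a generic routine, not the authors' pipeline.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: given two 3x4 camera projection
    matrices and a matched pixel pair (x1 in view 1, x2 in view 2,
    each as (u, v) coordinates), recover the 3D point."""
    # Each observation contributes two rows of the homogeneous system A X = 0.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]    # dehomogenize
```

Applied to every matched pair of contour points, this yields the 3D contour from which the object and its pose can be reconstructed.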