    Efficient Depth Intra Frame Coding in 3D-HEVC by Corner Points
Citations: 15 · References: 37 · Related Papers: 10
    Abstract:
To improve the coding performance of depth maps, 3D-HEVC includes several new depth intra coding tools, at the expense of increased complexity due to a flexible quadtree Coding Unit/Prediction Unit (CU/PU) partitioning structure and a huge number of intra mode candidates. Compared to natural images, depth maps contain large plain regions surrounded by sharp edges at object boundaries. We observe that the features proposed in the literature speed up either the CU/PU size decision or the intra mode decision, and that they struggle to make proper predictions for CUs/PUs containing multi-directional edges in depth maps. In this work, we reveal that CUs with multi-directional edges are highly correlated with the distribution of corner points (CPs) in the depth map. The CP is proposed as a feature that guides the splitting of CUs with multi-directional edges into smaller units until only a single-directional edge remains; such a smaller unit can then be well predicted by a conventional intra mode. In addition, a fast intra mode decision is proposed for non-CP PUs, which prunes the conventional HEVC intra modes, skips the depth modeling mode decision, and determines segment-wise DC coding early. Furthermore, a two-step adaptive corner point selection technique makes the proposed algorithm adaptive to frame content and quantization parameters, providing a flexible tradeoff between synthesized view quality and complexity. Simulation results show that the proposed algorithm reduces the runtime of the 3D-HEVC intra encoder by about 66% without noticeable quality degradation in synthesized views, and that it outperforms previous state-of-the-art algorithms in terms of time reduction and ΔBDBR.
    Keywords:
    Quadtree
    Depth map
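The splitting rule the abstract describes can be sketched minimally: a CU that contains detected corner points likely holds multi-directional edges and should be split further. The function name, the minimum CU size, and the plain point-in-rectangle test are illustrative assumptions, not the paper's exact procedure.

```python
def should_split_cu(corner_points, cu_x, cu_y, cu_size, min_size=8):
    """Sketch of the CP-guided split rule: a CU containing corner
    points (CPs) likely holds multi-directional edges, so it is split
    until only a single-directional edge remains or the minimum CU
    size is reached. Names and min_size are illustrative."""
    if cu_size <= min_size:
        return False
    # Does any detected corner point fall inside this CU?
    return any(cu_x <= x < cu_x + cu_size and cu_y <= y < cu_y + cu_size
               for (x, y) in corner_points)
```

In practice the corner points would come from a detector run once per frame (the paper's two-step adaptive selection tunes how many CPs survive), so this per-CU test is cheap.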
Stereo vision refers to the ability to infer information about the 3D structure of a scene from two or more images taken from different viewpoints. This paper describes a procedure for creating a depth map from rectified stereo images using the belief propagation (BP) segmentation algorithm. Essential steps in creating the depth map are camera calibration and rectification of the image pairs. Calibration of the stereoscopic cameras consists of estimating two sets of parameters of the stereo system: intrinsic parameters, which characterize the transformation mapping an image point from camera to pixel coordinates in each camera, and extrinsic parameters, which describe the relative position and orientation of the two cameras. Depth recovery is an important problem of image analysis in computer vision, and it is optimized here using belief propagation techniques.
    Epipolar geometry
    Computer stereo vision
    Depth map
    Belief Propagation
    Image rectification
    Stereo cameras
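Once the pair is rectified and a disparity has been matched, depth follows from the standard triangulation relation Z = f·B/d, which the pipeline above relies on; this helper is a generic sketch, not code from the paper.

```python
def depth_from_disparity(disparity_px, focal_px, baseline):
    """Triangulation for a rectified stereo pair: Z = f * B / d,
    where f is the focal length in pixels, B the baseline between
    the two cameras, and d the disparity in pixels. Depth comes
    out in the same unit as the baseline."""
    if disparity_px <= 0:
        return float('inf')  # zero disparity -> point at infinity
    return focal_px * baseline / disparity_px
```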
Among all the techniques for 3D acquisition, stereo vision systems are the most common. More recently, Time-of-Flight (ToF) range cameras have been introduced; they allow real-time depth estimation in conditions where stereo does not work well. Unfortunately, ToF sensors still have a limited resolution (e.g., 200 × 200 pixels). The goal of this paper is to combine the information from the ToF camera with one or two standard cameras in order to obtain a high-resolution depth image. To this end, we propose a new bilateral filter for depth map up-sampling, which exploits the additional information provided by a high-resolution color image. Moreover, we present an entire framework for the super-resolution of a real ToF depth map with a single camera color image. The algorithm can enhance the resolution of the ToF camera depth image up to 1920 × 1080 pixels.
    Depth map
    Stereo imaging
    Citations (3)
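The key idea above, weighting depth samples by both spatial distance and color similarity so that depth edges follow color edges, can be illustrated with a toy 1-D joint bilateral filter. The sigma parameters and the 1-D simplification are assumptions for illustration; the paper's filter operates on 2-D maps during up-sampling.

```python
import math

def joint_bilateral_1d(depth, color, sigma_s=1.0, sigma_r=10.0):
    """Toy 1-D joint bilateral filter: each output depth sample is a
    weighted average of its neighbors, where the weights combine
    spatial distance with similarity in the high-resolution color
    signal, so smoothing stops at color edges."""
    out = []
    n = len(depth)
    for i in range(n):
        wsum, vsum = 0.0, 0.0
        for j in range(n):
            w = (math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2)) *
                 math.exp(-((color[i] - color[j]) ** 2) / (2 * sigma_r ** 2)))
            wsum += w
            vsum += w * depth[j]
        out.append(vsum / wsum)
    return out
```

With a sharp color edge between samples, the two sides of the depth signal stay separated instead of being blurred together.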
In this paper, we propose a hybrid 2D-to-3D video conversion system to recover the 3D structure of the scene. Depending on the scene characteristics, geometric or height depth information is adopted to form the initial depth map. This depth map is fused with color-based depth cues to construct the final depth map of the scene background. The depths of the foreground objects are estimated after their classification into human and non-human regions. Specifically, the depth of a non-human foreground object is directly calculated from the depth of the region behind it in the background. To acquire a more accurate depth for regions containing a human, the estimated distance between face landmarks is also taken into account. Finally, the computed depth information of the foreground regions is superimposed on the background depth map to generate the complete depth map of the scene, which is the main goal in converting 2D video to 3D.
    Depth map
    Monocular
    Citations (1)
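The final compositing step above, superimposing foreground object depths onto the background depth map, amounts to a masked overwrite. The flat per-object depth value is a simplification assumed here for illustration.

```python
def compose_depth(background_depth, foreground_objects):
    """Sketch of the compositing step: foreground object depths are
    superimposed on the background depth map. background_depth is a
    2-D list; foreground_objects pairs a same-shape binary mask with
    that object's depth value (flat depth is a simplification)."""
    result = [row[:] for row in background_depth]
    for mask, depth in foreground_objects:
        for r, mask_row in enumerate(mask):
            for c, inside in enumerate(mask_row):
                if inside:
                    result[r][c] = depth
    return result
```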
A new video format called multiview video plus depth was recently designed as the most efficient 3D video representation. 3D High Efficiency Video Coding (3D-HEVC) is the international standard for 3D video coding, finalized in 2016. In 3D-HEVC intra coding, the depth map is an essential component, and depth map intra prediction occupies more than 85% of the overall intra encoding time. The 3D-HEVC design adopts a highly flexible quadtree coding unit partitioning, in which one coding tree unit is partitioned recursively into prediction units (PUs) from 64 × 64 down to 4 × 4. This flexible partitioning provides more accurate prediction signals and thereby achieves better intra depth map compression efficiency. However, performing all depth map intra prediction modes at every PU level, while achieving high intra coding efficiency, results in a substantial increase in computational complexity. This paper proposes an improvement of a previously proposed depth map PU size decision using an efficient homogeneity determination. Experimental results show that the proposed method significantly reduces computational complexity with a negligible loss of intra coding efficiency.
    Depth map
    Quadtree
    Algorithmic efficiency
    Citations (6)
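A common way to realize the "homogeneity determination" mentioned above is a variance test: a nearly flat depth block can keep a large PU, so further quadtree splitting is skipped. The paper's exact criterion is not reproduced here; the variance threshold is an illustrative stand-in.

```python
def is_homogeneous(block, threshold=1.0):
    """Variance-based homogeneity test (an assumed stand-in for the
    paper's criterion): if the depth samples in a block are nearly
    flat, the encoder can stop the recursive PU size search early.
    threshold is an illustrative tuning parameter."""
    samples = [v for row in block for v in row]
    mean = sum(samples) / len(samples)
    variance = sum((v - mean) ** 2 for v in samples) / len(samples)
    return variance <= threshold
```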
Due to the problems of noise, textureless regions, and depth discontinuities in stereo matching, a new matching method based on two cameras and one 3D image sensor is proposed in this paper. Though the 3D image sensor can provide a depth map, the map has low resolution and considerable noise, so it cannot be used directly for 3D reconstruction. However, combining two cameras with one 3D image sensor and treating the sensor's depth map as an initial sparse disparity map is advantageous for stereo matching: it can greatly improve matching accuracy and decrease running time. Finally, a dense disparity map is obtained. The experimental results indicate that the proposed algorithm performs well and that its disparity map is more accurate than those of existing methods.
    Depth map
    Computer stereo vision
    Stereo cameras
Depth discontinuity
    3D Reconstruction
    Citations (0)
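The speedup claimed above comes from using the sensor's sparse disparity as a seed: instead of searching the full disparity range at every pixel, matching only scans a small window around the seed. This helper is an assumed illustration of that idea; radius and max_disparity are made-up parameters.

```python
def search_range_from_seed(seed_disparity, radius=4, max_disparity=64):
    """Sketch of seeded disparity search: with a seed from the 3D
    image sensor, stereo matching searches only a small window around
    it instead of the full [0, max_disparity] range, cutting cost.
    Without a seed, it falls back to the full range."""
    if seed_disparity is None:
        return range(0, max_disparity + 1)
    lo = max(0, seed_disparity - radius)
    hi = min(max_disparity, seed_disparity + radius)
    return range(lo, hi + 1)
```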
Although depth information about a scene is not stored when taking a picture, some depth information is retained in the captured image. A method is proposed for generating the depth map of a single image based on different scene categories. The image is first classified into a category based on color and texture information, and the depth map is then generated according to the scene category. The depth map can be used to generate a stereo binocular image pair by left- and right-shifting the original image. The stereoscopic image with a three-dimensional (3-D) visual effect can then be viewed on a 3-D stereo display. The experiments showed that the proposed method works well, producing a satisfactory stereoscopic effect.
    Depth map
    Stereo image
    Binocular disparity
    Computer stereo vision
    Citations (6)
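The view-generation step above, shifting pixels in proportion to their depth to synthesize one view of a stereo pair, can be sketched per image row. The maximum shift, the rounding, and the simple hole filling are assumed simplifications of depth-image-based rendering, not the paper's exact renderer.

```python
def shift_row(row, depth_row, max_shift=4, max_depth=255, direction=1):
    """Sketch of depth-driven view synthesis for one image row: each
    pixel is shifted horizontally in proportion to its depth value.
    Holes left behind by the shift are filled with the previous
    pixel, a deliberately crude inpainting for illustration."""
    out = [None] * len(row)
    for x, (v, d) in enumerate(zip(row, depth_row)):
        nx = x + direction * round(max_shift * d / max_depth)
        if 0 <= nx < len(row):
            out[nx] = v
    for x in range(len(out)):           # simple hole filling
        if out[x] is None:
            out[x] = out[x - 1] if x > 0 else row[x]
    return out
```

Running the same row with `direction=-1` gives the other view of the pair.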
In 3D-HEVC, single depth intra mode has been applied and integrated into depth intra skip mode for efficient depth map coding. With single depth intra mode, one 2N×2N prediction unit (PU) is predicted without a computationally heavy prediction process. In this paper, we propose a fast single depth intra mode decision method to address the high computational complexity of the depth intra mode decision in 3D-HEVC. To remove unnecessary computation at the encoder, we decide single depth intra mode early in order to prune the quadtree in 3D-HEVC. This paper characterizes the statistics of smooth depth map signals for depth intra modes and analyzes the distortion metrics of the view synthesis optimization functionality as a decision criterion. With this criterion, single depth intra mode is detected early and the hierarchical CU/PU selection for intra coding can be stopped. As a consequence, the method exploits the correlation between hierarchical block-based video coding and the coding unit (CU)/PU mode decision for depth map coding, so that a large number of recursive rate-distortion cost calculations can be skipped. We demonstrate the effectiveness of our approach experimentally: the proposed scheme achieves approximately 25.6% encoding time saving with a 0.07% video PSNR/total bitrate gain and a 0.18% synthesized view PSNR/total bitrate loss under the all-intra configuration.
    Depth map
    Quadtree
    Algorithmic efficiency
    View synthesis
    Macroblock
    Citations (11)
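The early decision above can be sketched as: if a 2N×2N PU is (near-)constant in depth, single depth intra mode can represent it with one value and the hierarchical CU/PU search below it can stop. The strict-equality test here is an assumed stand-in for the paper's VSO-based distortion criterion, which is not reproduced.

```python
def try_single_depth_mode(pu):
    """Sketch of early single depth intra mode decision: a PU whose
    samples are all identical can be signalled with one depth value,
    skipping the recursive rate-distortion search beneath it. The
    equality test stands in for the paper's VSO-based criterion."""
    samples = [v for row in pu for v in row]
    first = samples[0]
    if all(v == first for v in samples):
        return first      # the single depth value to signal
    return None           # fall back to the full mode decision
```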
This paper presents a novel multi-depth map fusion approach for 3D scene reconstruction. Traditional stereo matching techniques that estimate disparities between two images often produce inaccurate depth maps because of occlusions and homogeneous areas. On the other hand, a depth map obtained from a depth camera is globally accurate but noisy and provides a limited depth range. To compensate for the pros and cons of these two methods, we propose a depth map fusion method that fuses multiple depth maps from stereo matching and the depth camera. Using a 3-view camera system that includes a depth camera for the center view, we first obtain 3-view images and a depth map from the center-view depth camera. We then calculate camera parameters by camera calibration. Using the camera parameters, we rectify the left- and right-view images with respect to the center-view image so that the well-known epipolar constraint is satisfied. Using the center-view image as a reference, we obtain two depth maps by stereo matching the center-left and center-right image pairs. After preprocessing each depth map, we pick an appropriate depth value for each pixel from the processed depth maps based on depth reliability. Simulation results obtained by the proposed method show improvements in some background regions.
    Epipolar geometry
    Depth map
    Citations (13)
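The final per-pixel selection described above can be sketched as picking, at each pixel, the candidate depth with the highest reliability score. How the reliability maps are computed is assumed given; the data layout here is illustrative.

```python
def fuse_depths(candidates):
    """Sketch of per-pixel fusion: from the preprocessed candidate
    depth maps (stereo pairs and the depth camera), keep for each
    pixel the depth value whose reliability score is highest.
    candidates is a list of (depth_map, reliability_map) pairs of
    2-D lists with equal shape; the scoring itself is assumed."""
    rows = len(candidates[0][0])
    cols = len(candidates[0][0][0])
    fused = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best = max(candidates, key=lambda cand: cand[1][r][c])
            fused[r][c] = best[0][r][c]
    return fused
```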