Today, 360° video has become an integral part of people's lives. Although the latest-generation standard, Versatile Video Coding (VVC), demonstrates a significant gain in coding efficiency over High Efficiency Video Coding (HEVC), there is still room for improvement in 360° video encoding. To further enhance the applicability of 360° video coding, this paper proposes an optimized rate control (RC) algorithm for 360° video in VVC. We present an efficient extraction algorithm for obtaining the video's saliency features. Furthermore, to account for the characteristics of 360° video, a partitioning algorithm is proposed that divides each frame into demand and non-demand regions. Additionally, to achieve precise and rational RC, a Coding Tree Unit (CTU)-level bit allocation strategy based on the saliency features is proposed for these regions. Experimental results show that the proposed RC algorithm achieves 11.77% bitrate savings and more accurate bit allocation compared with the default VVC algorithm, along with performance gains over a state-of-the-art algorithm.
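As a rough illustration of how a saliency-weighted, region-aware CTU bit allocation could look, the sketch below splits a frame-level bit budget between demand and non-demand regions and then distributes each region's share by normalized saliency. The function `allocate_ctu_bits`, the `demand_share` parameter, and the proportional weighting rule are illustrative assumptions; the abstract does not specify the paper's actual allocation model.

```python
import numpy as np

def allocate_ctu_bits(frame_budget, saliency, demand_mask, demand_share=0.8):
    """Distribute a frame-level bit budget over CTUs.

    frame_budget -- total bits available for the frame
    saliency     -- per-CTU saliency map (2-D array)
    demand_mask  -- boolean per-CTU map marking the demand region
    demand_share -- fraction of the budget reserved for the demand
                    region (illustrative value, not from the paper)
    """
    bits = np.zeros_like(saliency, dtype=float)
    for mask, share in ((demand_mask, demand_share),
                        (~demand_mask, 1.0 - demand_share)):
        region = saliency[mask]
        if region.size == 0:
            continue
        # Within each region, weight CTUs by normalized saliency;
        # fall back to a uniform split if the region has zero saliency.
        total = region.sum()
        weights = (region / total if total > 0
                   else np.full(region.size, 1.0 / region.size))
        bits[mask] = share * frame_budget * weights
    return bits

# Example on a 4x4 CTU grid where the top half is the demand region.
saliency = np.random.rand(4, 4)
demand = np.zeros((4, 4), dtype=bool)
demand[:2, :] = True
ctu_bits = allocate_ctu_bits(100_000, saliency, demand)
print(round(ctu_bits.sum()))  # ~100000: the frame budget is preserved
```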
Digital rock imaging plays an important role in studying the microstructure and macroscopic properties of rocks, where microcomputed tomography (MCT) is widely used. Due to the inherent limitations of MCT, a balance must be struck between the field of view (FOV) and resolution of rock MCT images: a large FOV at low resolution (LR) or a small FOV at high resolution (HR). However, both a large FOV and HR are expected for reliable analysis results in practice. Super-resolution (SR) is an effective solution to break through the mutual restriction between the FOV and resolution of rock MCT images, since it can reconstruct an HR image from an LR observation. Most existing SR methods cannot produce satisfactory HR results on real-world rock MCT images. One of the main reasons is that paired images are usually needed to learn the relationship between LR and HR rock images, yet it is challenging to collect such a dataset in a real scenario. Meanwhile, simulated datasets may not accurately reflect real imaging conditions. To address these problems, we propose a cycle-consistent generative adversarial network (CycleGAN)-based SR approach for real-world rock MCT images, namely, SRCycleGAN. In the off-line training phase, a set of unpaired rock MCT images is used to train the proposed SRCycleGAN, which can model the mapping between rock MCT images at different resolutions. In the on-line testing phase, the resolution of the LR input is enhanced via the mapping learned by SRCycleGAN. Experimental results show that the proposed SRCycleGAN can greatly improve the quality of simulated and real-world rock MCT images. The HR images reconstructed by SRCycleGAN show good agreement with the targets in terms of both the visual quality and the statistical parameters, including the porosity, the local porosity distribution, the two-point correlation function, the lineal-path function, the two-point cluster function, the chord-length distribution function, and the pore size distribution. Large-FOV and HR rock MCT images can be obtained with the help of SRCycleGAN. Hence, this work makes it possible to generate HR rock MCT images that exceed the limitations of imaging systems on FOV and resolution.
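To make the unpaired training idea concrete, here is a minimal PyTorch sketch of the cycle-consistency objective that underlies a CycleGAN-style SR model. `TinyGen`, `G_up`, `G_down`, and the 4x scale factor are placeholders; the abstract does not describe SRCycleGAN's actual generator architectures or loss weights, and the adversarial terms from the two discriminators are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

scale = 4  # assumed SR factor, not stated in the abstract

class TinyGen(nn.Module):
    """Toy generator: resampling to change resolution plus two convs."""
    def __init__(self, up):
        super().__init__()
        self.factor = float(scale) if up else 1.0 / scale
        self.body = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, x):
        x = F.interpolate(x, scale_factor=self.factor,
                          mode='bilinear', align_corners=False)
        return self.body(x)

G_up, G_down = TinyGen(up=True), TinyGen(up=False)

lr = torch.rand(1, 1, 32, 32)    # unpaired LR rock MCT patch
hr = torch.rand(1, 1, 128, 128)  # unpaired HR rock MCT patch

# Cycle losses: LR -> HR -> LR and HR -> LR -> HR should be identities,
# which is what lets the model train without paired LR/HR images.
loss_cycle = (F.l1_loss(G_down(G_up(lr)), lr) +
              F.l1_loss(G_up(G_down(hr)), hr))
loss_cycle.backward()
```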
Accurately acquiring the three-dimensional (3D) image of a porous medium is essential for predicting its multiple physical properties. Given the inherent multiscale nature of the pores contained in porous media such as tight sandstones, completely characterizing the pore structure requires scanning the microstructure at different resolutions. Specifically, low-resolution (LR) images cover a larger field of view (FOV) of the sample but lack small-scale features, whereas high-resolution (HR) images contain ample information but sometimes cover only a limited FOV. To address this issue, we propose a method for fusing the spatial information from a two-dimensional (2D) HR image into a 3D LR image, finally reconstructing an integrated 3D structure with added fine-scale features. In the fusion process, the large-scale structure depicted by the 3D LR image is fixed as the background, and the 2D image is utilized as the training image to reconstruct a small-scale structure on that background. To assess the performance of our method, we test it on a sandstone scanned at low and high resolutions. Statistical properties of the reconstructed image and the target are quantitatively compared. The comparison indicates that the proposed method enables an accurate fusion of the LR and HR images, as the small-scale information is precisely reproduced within the large-scale structure.
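The quantitative comparison can be illustrated with one of the standard descriptors used for such validation, the two-point correlation function. The estimator below is a simple directional version for a binary pore/grain image; it is a sketch for checking a reconstruction against a target, not the paper's exact implementation.

```python
import numpy as np

def two_point_correlation(img, max_r):
    """Directional two-point correlation S2(r) along the last axis of a
    binary image (1 = pore, 0 = grain). S2(0) equals the porosity."""
    img = img.astype(float)
    n = img.shape[-1]
    return np.array([(img[..., :n - r] * img[..., r:]).mean()
                     for r in range(max_r)])

# Compare a reconstructed 3-D volume against the target volume
# (random stand-ins here; real use would load the scanned images).
target = (np.random.rand(32, 32, 32) < 0.2).astype(np.uint8)
recon = (np.random.rand(32, 32, 32) < 0.2).astype(np.uint8)
s2_target = two_point_correlation(target, 16)
s2_recon = two_point_correlation(recon, 16)
print(np.abs(s2_target - s2_recon).max())  # small when statistics agree
```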
JPEG is one of the most widely used lossy compression methods. JPEG-compressed images usually suffer from compression artifacts, including blocking and blurring, especially at low bit rates. Soft decoding is an effective solution to improve the quality of compressed images without changing the codec or introducing extra coding bits. Inspired by the excellent performance of deep convolutional neural networks (CNNs) on both low-level and high-level computer vision problems, we develop a dual pixel-wavelet domain deep CNN-based soft decoding network for JPEG-compressed images, namely DPW-SDNet. The pixel-domain deep network takes the four downsampled versions of the compressed image as a 4-channel input and outputs a pixel-domain prediction, while the wavelet-domain deep network uses the one-level discrete wavelet transform (DWT) coefficients as a 4-channel input to produce a DWT-domain prediction. The pixel-domain and wavelet-domain estimates are combined to generate the final soft-decoded result. Experimental results demonstrate the superiority of the proposed DPW-SDNet over several state-of-the-art compression artifact reduction algorithms.
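The two 4-channel inputs described above could be formed as in the sketch below. It assumes the "four downsampled versions" are the polyphase components of the image and uses the Haar wavelet via PyWavelets for the one-level DWT; both choices are assumptions, since the abstract does not pin them down.

```python
import numpy as np
import pywt  # PyWavelets, for the one-level 2-D DWT

def pixel_branch_input(y):
    """Polyphase decomposition: the four 2x-downsampled versions of the
    compressed image, stacked as a (4, H/2, W/2) tensor. An assumed
    reading of the abstract's 'four downsampled versions'."""
    return np.stack([y[0::2, 0::2], y[0::2, 1::2],
                     y[1::2, 0::2], y[1::2, 1::2]])

def wavelet_branch_input(y):
    """One-level DWT subbands (LL, LH, HL, HH) stacked as 4 channels;
    the Haar wavelet here is an assumption, not taken from the paper."""
    cA, (cH, cV, cD) = pywt.dwt2(y, 'haar')
    return np.stack([cA, cH, cV, cD])

y = np.random.rand(128, 128)          # stand-in for a JPEG-decoded image
print(pixel_branch_input(y).shape)    # (4, 64, 64)
print(wavelet_branch_input(y).shape)  # (4, 64, 64)
```

Both branches thus see the full image content at half spatial size, which is what lets the two networks produce predictions that can be fused into the final soft-decoded result.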