Visual signal coding and quality evaluation
2011
Visual signal (i.e., image and video) coding compresses digital visual data to the smallest possible size by exploiting various kinds of data redundancy, so as to make good use of limited network bandwidth and to allow compact storage. It exploits the redundancy in the signal itself (statistical redundancy, i.e., spatial-temporal redundancy and spectral/color redundancy). Since the human visual system (HVS) is the ultimate receiver and appreciator of most processed visual signals, we should also consider the redundancy that arises from the properties of human vision (i.e., perceptual/psycho-visual redundancy) in the course of coding. The effectiveness of image and video coding methods is traditionally evaluated by their rate-distortion (RD) performance, where rate is the number of bits required for the compressed visual signal (or a variant such as bits per pixel (bpp) or bits per second) and distortion is usually measured by the peak signal-to-noise ratio (PSNR). However, PSNR has been found not to always agree with human judgment, and the measurement of perceptual distortion is therefore an active research area.
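For concreteness, the two RD quantities named above can be sketched as follows. This is a minimal illustration, not code from the work itself: it assumes 8-bit images stored as lists of rows (so the peak value defaults to 255), and the function names mse, psnr and bits_per_pixel are hypothetical.

```python
import math


def mse(reference, distorted):
    """Mean squared error between two equally sized 2-D images (lists of rows)."""
    total = 0.0
    count = 0
    for ref_row, dist_row in zip(reference, distorted):
        for r, d in zip(ref_row, dist_row):
            total += (r - d) ** 2
            count += 1
    return total / count


def psnr(reference, distorted, peak=255.0):
    """PSNR in dB; peak=255 assumes 8-bit samples (an assumption of this sketch)."""
    err = mse(reference, distorted)
    if err == 0:
        return math.inf  # identical images: distortion-free
    return 10.0 * math.log10(peak * peak / err)


def bits_per_pixel(compressed_bytes, width, height):
    """Rate expressed as bits per pixel (bpp) for a single image."""
    return 8.0 * compressed_bytes / (width * height)
```

An RD point for one coded image is then the pair (bits_per_pixel(...), psnr(...)); a codec is usually compared by plotting several such points at different quality settings.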
First, in this work, we discuss the statistical redundancy of video and then propose a novel video coding scheme based on the optimal compression plane (OCP). As a data structure, video is nothing more than a three-dimensional data matrix, and the distinction among X (a spatial dimension), Y (the other spatial dimension), and T (the temporal dimension) is not strictly necessary. We ignore the physical meaning of the X, Y, and T axes during the video coding process: frames are allowed to be formed in the TX (or TY) plane rather than the traditional XY plane to exploit the redundancy more effectively, and better coding performance is thereby achieved.
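The idea of forming frames in a different plane can be illustrated with a small sketch. It assumes the video is stored as nested lists indexed video[t][y][x]. The plane selection shown here uses the mean absolute difference between consecutive frames as a crude redundancy proxy; that proxy, and the names slice_tx, slice_ty, mean_inter_frame_diff and pick_plane, are assumptions of this example and not the selection criterion of the proposed scheme.

```python
def slice_xy(video):
    """Conventional slicing: one XY frame per time instant t."""
    return [[row[:] for row in frame] for frame in video]


def slice_tx(video):
    """One TX frame per row index y; each frame is indexed [t][x]."""
    n_t, n_y = len(video), len(video[0])
    return [[video[t][y][:] for t in range(n_t)] for y in range(n_y)]


def slice_ty(video):
    """One TY frame per column index x; each frame is indexed [t][y]."""
    n_t, n_y, n_x = len(video), len(video[0]), len(video[0][0])
    return [[[video[t][y][x] for y in range(n_y)] for t in range(n_t)]
            for x in range(n_x)]


def mean_inter_frame_diff(frames):
    """Mean absolute difference between consecutive frames (hypothetical proxy)."""
    total, count = 0, 0
    for prev, cur in zip(frames, frames[1:]):
        for prev_row, cur_row in zip(prev, cur):
            for p, c in zip(prev_row, cur_row):
                total += abs(p - c)
                count += 1
    return total / count if count else float("inf")


def pick_plane(video):
    """Return the plane whose frame sequence changes least from frame to frame."""
    candidates = {
        "XY": slice_xy(video),
        "TX": slice_tx(video),
        "TY": slice_ty(video),
    }
    return min(candidates, key=lambda name: mean_inter_frame_diff(candidates[name]))
```

In a real coder the choice would more plausibly be driven by the encoder's actual rate-distortion cost for each plane; the frame-difference proxy only keeps the example short.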
Second, we study the model that reflects the masking characteristics of the HVS, as it is fundamental to exploiting perceptual redundancy and to measuring visual distortion (quality). The just noticeable difference (JND) accounts for various masking effects of the HVS. We improve the pixel-domain JND model through better contrast masking (CM) evaluation and by appropriately accounting for the difference in CM between textural and edge regions. We also investigate the application of perceptual models (i.e., the visual attention model and the JND model) to adaptive-sampling-based low-bit-rate image coding and to JND-based histogram adjustment for visually lossless image coding.
Lastly, an effective and efficient metric for visual quality/distortion evaluation is proposed. The metric is based on the similarity between the gradient profiles of the reference and distorted signals, which accounts for both a high-level premise of the HVS (i.e., high sensitivity to image edges and structure) and the masking property. The new metric is simple to compute and highly accurate (verified with extensive cross-database tests); it is robust to various distortion types and can be easily embedded in coding systems (as well as other visual signal…
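To give a feel for what a gradient-based comparison looks like, the sketch below computes a generic gradient-magnitude similarity score between a reference and a distorted image. It is not the metric proposed in this work, whose exact formulation is not given here: the central-difference gradient, the pointwise similarity ratio, the stabilizing constant c, and the simple averaging are all assumptions of this example.

```python
import math


def gradient_magnitude(image):
    """Gradient magnitude via central differences, with borders clamped."""
    h, w = len(image), len(image[0])
    grad = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            grad[y][x] = math.hypot(gx, gy)
    return grad


def gradient_similarity(reference, distorted, c=1e-3):
    """Mean pointwise similarity of gradient magnitudes, in (0, 1].

    c is a small stabilizing constant chosen arbitrarily for this sketch.
    """
    g_ref = gradient_magnitude(reference)
    g_dist = gradient_magnitude(distorted)
    total, count = 0.0, 0
    for ref_row, dist_row in zip(g_ref, g_dist):
        for r, d in zip(ref_row, dist_row):
            total += (2.0 * r * d + c) / (r * r + d * d + c)
            count += 1
    return total / count
```

A score of 1 means the two gradient fields match everywhere; the score drops where edges in the reference are weakened, removed, or added by the distortion.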