This paper describes the progress in frequency-domain linear prediction coding (LPC)-based audio coding schemes. Although LPC was originally used only for time-domain speech coders, it has been applied to frequency-domain coders since the late 1980s. With the progress in associated technologies, the frequency-domain LPC-based audio coding scheme has become more promising, and since 2010 it has been used in speech/audio coding standards such as MPEG-D unified speech and audio coding and 3GPP enhanced voice services. Three of the latest investigations on the representations of LPC envelopes in frequency-domain coders are shown: the harmonic model, frequency-resolution warping, and the Powered All-Pole Spectral Envelope, all of which aim at further enhancing coding efficiency.
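For readers unfamiliar with LPC envelopes, the following is a minimal sketch of the textbook autocorrelation method: the Levinson-Durbin recursion yields all-pole coefficients, and the envelope is the prediction error divided by the squared magnitude of the prediction filter. It is not the harmonic model, the warping scheme, or the Powered All-Pole Spectral Envelope discussed in the paper; the function names `levinson_durbin` and `lpc_envelope` are illustrative assumptions.

```python
import cmath
import math


def levinson_durbin(r, order):
    """Solve the LPC normal equations for autocorrelation values r[0..order].

    Returns (a, err), where a[0] == 1 and A(z) = sum_k a[k] z^-k is the
    prediction-error filter, and err is the final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Correlation between the current residual and the lag-i sample.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def lpc_envelope(a, err, num_bins):
    """Power envelope err / |A(e^{jw})|^2 at num_bins frequencies in [0, pi)."""
    envelope = []
    for b in range(num_bins):
        w = math.pi * b / num_bins
        response = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        envelope.append(err / abs(response) ** 2)
    return envelope


# Autocorrelation of a first-order autoregressive process with pole at 0.5
# (hypothetical input chosen only for illustration).
coeffs, error = levinson_durbin([1.0, 0.5, 0.25], order=2)
envelope = lpc_envelope(coeffs, error, num_bins=8)
```

In a frequency-domain coder, an envelope of this kind is typically used to shape the quantization of the spectral coefficients; how that is done in any particular standard is outside the scope of this sketch.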
A general class of the almost instantaneous fixed-to-variable-length (AIFV) codes is proposed, which contains every possible binary code we can make when allowing finite bits of decoding delay. The contribution of the paper lies in the following. (i) Introducing $N$-bit-delay AIFV codes, constructed by multiple code trees with higher flexibility than the conventional AIFV codes. (ii) Proving that the proposed codes can represent any uniquely-encodable and uniquely-decodable variable-to-variable length codes. (iii) Showing how to express codes as multiple code trees with minimum decoding delay. (iv) Formulating the constraints of decodability as the comparison of intervals in the real number line. The theoretical results in this paper are expected to be useful for further study on AIFV codes.
This paper presents an optimal construction of $N$-bit-delay almost instantaneous fixed-to-variable-length (AIFV) codes, the general form of binary codes we can make when finite bits of decoding delay are allowed. The presented method enables us to optimize lossless codes among a broader class of codes than the conventional FV and AIFV codes. The paper first discusses the problem of code construction, which contains some essential partial problems, and defines three classes of optimality to clarify how far we can solve the problems. The properties of the optimal codes are analyzed theoretically, showing the sufficient conditions for achieving the optimum. Then, we propose an algorithm for constructing $N$-bit-delay AIFV codes for given stationary memoryless sources. The optimality of the constructed codes is discussed both theoretically and empirically. The constructed codes achieved shorter expected code lengths when $N\ge 3$ than the conventional AIFV-$m$ and extended Huffman codes. Moreover, in simulations with random numbers, they achieved higher compression efficiency than the 32-bit-precision range codes under reasonable conditions.
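As context for the extended Huffman baseline mentioned above, the following minimal sketch computes the expected code length per source symbol when blocks of symbols from a memoryless source are Huffman-coded, and compares it with the entropy. It does not implement the proposed $N$-bit-delay AIFV construction; the function names and the example source probabilities are assumptions chosen for illustration.

```python
import heapq
import itertools
import math


def huffman_lengths(probs):
    """Return Huffman codeword lengths for the given symbol probabilities."""
    if len(probs) == 1:
        return [1]
    # Heap entries: (probability, unique tie-breaker, symbols in the subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Every symbol under the merged node gets one more bit.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths


def extended_huffman_rate(probs, block):
    """Expected bits per source symbol when Huffman-coding blocks of symbols."""
    block_probs = [math.prod(t) for t in itertools.product(probs, repeat=block)]
    lengths = huffman_lengths(block_probs)
    return sum(p * n for p, n in zip(block_probs, lengths)) / block


def entropy(probs):
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical skewed binary source, for illustration only.
source = [0.9, 0.1]
rates = {k: extended_huffman_rate(source, k) for k in (1, 2, 3)}
```

For this skewed source, the per-symbol rate approaches the entropy as the block size grows, at the cost of a codebook whose size is exponential in the block length. That trade-off is the usual motivation for considering alternatives to block extension.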
This paper presents extended-domain Golomb (XDG) code, an extension of Golomb code for sparse geometric sources as well as a generalization of extended-domain Golomb-Rice (XDGR) code, based on the idea of almost instantaneous fixed-to-variable length (AIFV) codes. Showing that the XDGR encoding can be interpreted as extended usage of the code proposed in the previous works, this paper discusses the following two facts: the proposed XDG code can be constructed as an AIFV code relating to Golomb code as XDGR code does to Rice code; and XDG and Golomb codes are symmetric in the sense of relative redundancy. The proposed XDG code can be efficiently used for losslessly compressing geometric sources too sparse for the conventional Golomb and Rice codes. According to the symmetry, its relative redundancy is guaranteed to be as low as that of Golomb code when compressing non-sparse geometric sources. Owing to this fact, the parameter of the proposed XDG code, which is more finely tunable than that of the conventional XDGR code, can be optimized for given inputs using the conventional techniques. Therefore, it is expected to be more useful for many coding applications that deal with geometric sources at low bit rates.
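As background for the conventional Golomb code that XDG extends, here is a minimal sketch of a Golomb encoder and decoder for non-negative integers. It is the standard Golomb code, not the proposed XDG code. The function names, the bit-string representation, and the unary convention (a run of ones terminated by a zero) are assumptions made for illustration. When the parameter `m` is a power of two, the code reduces to a Rice code.

```python
def golomb_encode(n, m):
    """Encode non-negative integer n with Golomb parameter m >= 1 as a bit string."""
    if n < 0 or m < 1:
        raise ValueError("n must be >= 0 and m must be >= 1")
    q, r = divmod(n, m)
    bits = "1" * q + "0"          # quotient in unary
    b = (m - 1).bit_length()      # ceil(log2(m))
    cutoff = (1 << b) - m         # number of remainders that get b - 1 bits
    if r < cutoff:
        width, value = b - 1, r
    else:
        width, value = b, r + cutoff
    if width > 0:                 # remainder in truncated binary
        bits += format(value, "b").zfill(width)
    return bits


def golomb_decode(bits, m):
    """Decode one codeword from the front of bits; return (value, bits consumed)."""
    pos = 0
    q = 0
    while bits[pos] == "1":       # read the unary quotient
        q += 1
        pos += 1
    pos += 1                      # skip the terminating zero
    b = (m - 1).bit_length()
    cutoff = (1 << b) - m
    r = 0
    if b > 0:
        if b > 1:
            r = int(bits[pos:pos + b - 1], 2)
            pos += b - 1
        if r >= cutoff:           # long remainder: read one more bit
            r = ((r << 1) | int(bits[pos])) - cutoff
            pos += 1
    return q * m + r, pos


# Round trip for a hypothetical input value.
codeword = golomb_encode(4, 3)
value, used = golomb_decode(codeword, 3)
```

Because the shortest codeword in this scheme is one bit long, the rate cannot drop below one bit per symbol, which is the limitation on very sparse sources that motivates extended-domain codes.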
This paper presents a qualitative approach of combining Golomb-Rice (GR) code with algebraic bijective mappings which losslessly convert between arbitrary positive integers of different dimension and shape the distribution of generalized Gaussian sources. The mappings, integer nesting and splitting, enable GR encoding, with a little additional computation, to compress sources from wider classes of distributions than Laplacian more efficiently. Simulations showed that, especially for some Gaussian sources, an almost optimal average code length can be achieved by performing integer nesting before GR encoding the integers. This scheme will be useful for applications dealing with various types of sources and requiring low computational costs.
We have devised a method for estimating, from a single frame of audio frequency spectra, a shape parameter of a multivariate generalized Gaussian distribution which has variance represented by an all-pole model and no covariance. Based on powered all-pole spectrum estimation (PAPSE), which is an extension of linear prediction, the proposed method simultaneously estimates the shape parameter and the maximum-likelihood variance, allowing more accurate representation of the probability density functions of the spectra. As an example of its application, this paper shows an integration of the estimation into an audio codec, which improved the objective and subjective reconstruction quality. Since this estimation method provides us with simple parameters which reflect some acoustic features of signals, the method may also be useful in other audio signal processing problems.