Improved automatic language identification in noisy speech

International Conference on Acoustics, Speech, and Signal Processing (2003)

Abstract:

A description is given of enhancements to an automatic language identification algorithm previously reported. The algorithm, based on linear-predictive-coding-based formant extraction, was greatly improved, reducing the error rate by more than 50%. This performance was achieved on a large (>9 h), very noisy, six-language database, using trials of less than 10 s. Experiments that improved performance are described, including tests of various distance metrics, expanded and modified parameter sets, and a new voicing statistic. Final performance results were obtained as a function of time, signal-to-noise ratio, and no-decision rate. A new rejection capability was developed to address the open-set identification problem.

Keywords:

Identification

Word error rate

Statistic

Topics:

Natural Language Processing Techniques

Speech Recognition and Synthesis

Speech and dialogue systems

10.1109/icassp.1989.266480

Optimization implementation of the linear predictive coding analysis for speech signal on DSP

Kun Gu Geng Zhao

Linear predictive coding (LPC) analysis of the speech signal is an important part of speech coding, and it accounts for most of the computation time in a speech coder. To improve the speed of speech coding, LPC analysis was implemented in assembly language on the C54x DSP and optimized in this paper. Compared with the original method, the run time of the coder can be reduced to 5%.

Linear prediction

Predictive coding

Citations (0)

An improved algorithm for residual signal excitation based on LPC 10

2022 7th International Conference on Communication, Image and Signal Processing (CCISP) (2022)

Xin Yu Xingyuan You Xiaoling Liu Chuan Li

Under narrowband shortwave communication conditions, digital speech coding mostly takes the form of low-rate linear predictive coding, but LPC parametric coding recovers speech with low naturalness and a buzz. In this paper, we propose a method to improve the residual signal excitation based on LPC10. At the coding end, the prediction coefficients are solved by linear prediction analysis, the original speech is inverse filtered with those coefficients, and the difference from the original speech signal gives the residual signal. At the decoding end, the original muffled pulse excitation is replaced with the residual signal, and the resulting synthesized speech reduces the hum of the original LPC synthesized speech. The generated speech and the original speech are scored by the PESQ algorithm, and the results show that the improved speech score is 1.68, which is 0.34 points higher than the LPC 10 synthesized speech score.
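The residual computation described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `inverse_filter` and `synthesize`, the sample values, and the second-order coefficients are all assumptions chosen for the example. It computes the residual as the original sample minus its linear prediction, then feeds that residual back through the all-pole synthesis filter as the excitation.

```python
def inverse_filter(speech, coeffs):
    """Residual e[n] = s[n] - sum_k a_k * s[n-k]; samples before n = 0 count as zero."""
    residual = []
    for n, sample in enumerate(speech):
        prediction = sum(
            a * speech[n - k]
            for k, a in enumerate(coeffs, start=1)
            if n - k >= 0
        )
        residual.append(sample - prediction)
    return residual


def synthesize(excitation, coeffs):
    """All-pole synthesis: s[n] = e[n] + sum_k a_k * s[n-k]."""
    output = []
    for n, e in enumerate(excitation):
        prediction = sum(
            a * output[n - k]
            for k, a in enumerate(coeffs, start=1)
            if n - k >= 0
        )
        output.append(e + prediction)
    return output


# Hypothetical frame and predictor coefficients, for illustration only.
speech = [0.0, 0.5, 0.9, 0.7, 0.1, -0.4, -0.8, -0.6]
coeffs = [1.3, -0.4]

residual = inverse_filter(speech, coeffs)
rebuilt = synthesize(residual, coeffs)
```

With an unquantized residual the synthesis filter reproduces the input frame up to rounding error, which is why using the residual in place of a pulse excitation can sound more natural. A real coder would quantize both the coefficients and the residual, so the reconstruction would only be approximate.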

Linear prediction

SIGNAL (programming language)

Codec2

10.1109/ccisp55629.2022.9974264

Citations (1)

Improved linear predictive speech coding technique and realizes

Computer Engineering and Applications Journal (2009)

Liu Gui-bin

Linear prediction coding (LPC) is an important technology for realizing speech coding. In this paper, the realization of linear prediction coding is introduced, and an improved sound excited linear prediction speech coding method is proposed. Finally, the paper compares simple LPC speech coding with sound excited LPC speech coding. The experimental results demonstrate that the method achieves speech coding well and that its sound quality is better than that of simple LPC.

Linear prediction

Predictive coding

Citations (0)

A comparative analysis on cepstrum, linear predictive coding and particle filtering based formant estimation methods

Mustafa Anıl Reşat Halil Ibrahim Gokcimen Umut Arıöz

Formants can describe the basic properties of speech efficiently with very limited parameter sets; they have therefore found important use in many speech processing applications such as coding, recognition, synthesis and enhancement. Estimating formants is harder than simply tracking the peaks of the spectrum, because the spectral peaks of the vocal tract output depend on the shape of the vocal tract, the excitation and the periodicity in a complex way. For this reason, much past work has been done on formant estimation, and the strengths and weaknesses of the resulting methods have been recognized. In this article we analyze and compare the performance of some popular formant estimation methods. Of the three methods compared, the particle filtering based formant estimation method performs best. Furthermore, the linear predictive coding method has estimation difficulties with signals sampled at low frequencies, and the cepstrum method produces excess formants during peak picking.
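As a rough illustration of LPC-based formant estimation by peak picking, the sketch below evaluates the LPC spectral envelope 1/|A(e^{jw})| on a frequency grid and reports its local maxima. It is not the method evaluated in the paper: the function names, the grid size, and the single synthetic resonance (pole radius 0.95 near 1000 Hz at an 8 kHz sampling rate) are assumptions made for the example.

```python
import cmath
import math


def lpc_envelope(coeffs, num_points=512):
    """Magnitude of 1/A(e^{jw}) for w on a uniform grid in [0, pi)."""
    magnitudes = []
    for i in range(num_points):
        w = math.pi * i / num_points
        # A(z) = 1 - sum_k a_k z^{-k}, evaluated on the unit circle.
        a_of_w = 1.0 - sum(
            a * cmath.exp(-1j * w * k) for k, a in enumerate(coeffs, start=1)
        )
        magnitudes.append(1.0 / abs(a_of_w))
    return magnitudes


def pick_formants(coeffs, sample_rate, num_points=512):
    """Return frequencies (Hz) of interior local maxima of the LPC envelope."""
    mags = lpc_envelope(coeffs, num_points)
    peaks = []
    for i in range(1, num_points - 1):
        if mags[i - 1] < mags[i] > mags[i + 1]:
            peaks.append(i * sample_rate / (2.0 * num_points))
    return peaks


# Hypothetical second-order predictor with one resonance near 1000 Hz.
sample_rate = 8000.0
radius, centre_hz = 0.95, 1000.0
theta = 2.0 * math.pi * centre_hz / sample_rate
coeffs = [2.0 * radius * math.cos(theta), -radius ** 2]

formants = pick_formants(coeffs, sample_rate)
```

Peak picking on the envelope is simple, but it can merge two close formants into one peak. Solving for the roots of the prediction polynomial is a common alternative, at the cost of a polynomial root finder.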

Vocal tract

Cepstrum

Mel-frequency cepstrum

Linear prediction

10.1109/siu.2014.6830241

Citations (0)

Design and implementation of a parametric speech coder

IEEE Transactions on Consumer Electronics (1998)

Sam Kwong P.T. Nui

Speech coders based on linear predictive coding are good and accurate in modeling speech utterances. They not only model speech utterance in an accurate manner but have the following properties: i) they provide a good estimate of the vocal tract spectral envelope; ii) they are analytically tractable; iii) they can be implemented in either software or hardware; iv) they use less data storage than many other approaches of speech coders. In this paper, a hybrid speech coder based on code excited linear predictive coding (CELPC) and voice excited linear predictive coding (VELPC) is presented. CELPC is one of the main techniques for producing high quality speech at around 4.8 kbps. However, the computation requirement for the original CELPC was too demanding and not suitable for practical use. Therefore, a hybrid approach to produce speech signals with high quality and with a bit rate of 3.478 kbps has been proposed. This method splits the speech signal into two portions, the base-band signal and the high-band signal in the frequency domain. The base-band and the high-band signal are then coded using the CELPC and VELPC techniques, respectively.

Codec2

Linear prediction

Spectral envelope

10.1109/30.663743

Citations (1)

Speech synthesis based on AMR-WB algorithm

Chang Shu Jinshuo Mei Jinghua Yin

The linear predictive coding (LPC) synthesis parameters are extracted using the 23.05 kbit/s mode of adaptive multi-rate wideband (AMR-WB) speech coding. The synthesis results show that the synthesized speech is similar to the sample speech in both the time and frequency domains, and that it also sounds like the sample speech. The synthesized speech retrieves the characteristics of the sample well and gives good synthesis results. The algorithm meets the requirements of a speech synthesis system that needs low decoder complexity. It is therefore feasible to apply the AMR-WB algorithm to linear predictive coding speech synthesis.

Codec2

Linear prediction

Predictive coding

Wideband audio

10.1109/emeit.2011.6023479

Citations (0)

CELP Coding for high-quality speech at 8 kbit/s

M. Copperi D. Sereno

A new speech coding technique at low bit-rate is presented in this paper. The coder is based upon a novel speech production model, independently developed by the authors [1,2] and by Atal and Schroeder [3,4], called CELP (Codebook Excited Linear Prediction). Differences exist between the two approaches, both in the strategy chosen to construct codebooks, and in the method to generate the innovation sequence. In this scheme, we split the incoming speech signal into two frequency bands in order to gain the benefits of the piecewise LP (Linear Prediction) approximation. Then, each residual signal is coded in blocks of 5-ms duration through an adaptive vector quantizer incorporating a noise shaping filter. Our results show that good quality speech can be obtained at 8 kbit/s.

Codec2

Linear prediction

Full Rate

10.1109/icassp.1986.1169257

Citations (9)

An Algorithm for Extraction of Speech Spectral Envelope Using Piecewise Linear Predictive Coding

Audio Engineering (2008)

Wan Mao-wen

In speech analysis, extraction of the speech spectral envelope is very important. An algorithm using piecewise LPC (linear predictive coding) is presented. Filter bank theory is used together with LPC: the speech spectrum is divided into bands, and LPC is then applied to each spectral band. Experiments suggest that the resolving capability of the synthetic spectra is improved.

Spectral envelope

Predictive coding

Linear prediction

Envelope (radar)

Spectral Analysis

Citations (0)

Synthesis of Speech Signal Based on Linear Prediction

Journal of Northwest University for Nationalities (2010)

Yanping He

Linear prediction coding is an important technology for realizing speech coding. By studying the speech signal and LPC, the principles of linear predictive analysis of the speech signal are introduced, the autocorrelation method for solving the linear prediction equations is analyzed in detail, and linear prediction experiments are carried out on a practical speech signal with MATLAB. The experimental results demonstrate that speech coded with linear predictive coding has few errors, simple calculation and fast synthesis.
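The autocorrelation method mentioned in this abstract is commonly solved with the Levinson-Durbin recursion. The sketch below is a minimal pure-Python version, not the paper's MATLAB code; the function names and the synthetic test signal (the impulse response of a second-order all-pole filter with coefficients 1.3 and -0.4) are assumptions for illustration. The predictor convention is s[n] ≈ sum_k a_k s[n-k].

```python
def autocorrelation(signal, max_lag):
    """r[lag] = sum_i s[i] * s[i + lag] for lag = 0..max_lag."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (coefficients, prediction error)."""
    a = [0.0] * (order + 1)  # a[1..order] hold the predictor; a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # degenerate (e.g. all-zero) input; stop early
        # Reflection coefficient for stage i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:], error


def lpc(signal, order):
    """LPC coefficients of a frame via the autocorrelation method."""
    return levinson_durbin(autocorrelation(signal, order), order)


# Synthetic frame: impulse response of s[n] = 1.3 s[n-1] - 0.4 s[n-2] + e[n].
frame = [1.0, 1.3]
for _ in range(198):
    frame.append(1.3 * frame[-1] - 0.4 * frame[-2])

estimated, residual_energy = lpc(frame, 2)
```

On this decaying synthetic frame the recursion should recover coefficients very close to the ones used to generate it. On real speech a tapered window (for example a Hamming window) is usually applied to each frame before the autocorrelation is computed.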

Linear prediction

Predictive coding

SIGNAL (programming language)

Citations (0)

Mixed spectral representation—Formants and linear predictive coding

The Journal of the Acoustical Society of America (1992)

Joseph P. Olive

A cascade formant model is well suited to describe certain speech segments, such as vowels and vowel-like sounds. The formant model is also useful because the relationship between formants and vocal tract configurations is well understood; however, this model is not adequate for other speech sounds, such as stops, fricatives, and nasals. On the other hand, LPC (linear predictive coding) analysis of sufficiently high order can adequately describe the spectrum of any speech sound, but the relationship between the LPC parameters and the spectrum or vocal tract configuration is not obvious. This paper describes a speech analysis/synthesis scheme that uses both formants and LPC parameters for different sections of a speech signal. Thus, in some regions, the benefit of the formant model can be utilized, while in other regions, the LPC representation can be used to obtain a good description of the speech spectrum. The analysis algorithm resolves the problem of discontinuities that arise from using the two different spectral representations. When provided with the same multipulse signal, the method of speech analysis described in this paper produces resynthesized speech of the quality of multipulse LPC.

Vocal tract

Linear prediction

Representation

10.1121/1.403840

Citations (5)