A Speech Reconstruction Algorithm via Iteratively Reweighted $\ell_{2}$ Minimization for MFCC Codec
2019
This paper presents an effective method for the inverse problem of Mel-frequency cepstral analysis and describes how to reconstruct speech waveforms directly from Mel-frequency cepstral coefficients (MFCCs). To exploit the sparsity of speech in the frequency domain, an iteratively reweighted $\ell_{2}$ minimization method is proposed to cope with the under-determined nature of the reconstruction problem. The phase information lost during Mel-frequency cepstral analysis is recovered by the inverse short-time Fourier transform magnitude algorithm. Experiments are conducted on the TIMIT database and evaluated with several different measures. The results demonstrate that the proposed method recovers speech with high articulation and intelligibility. In particular, with high-resolution MFCCs the reconstructed speech sounds very close to the original, and the average STOI and PESQ scores reach 93% and 4.0, respectively. The method could easily be used for an MFCC codec at low bit rates.
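The general idea behind iteratively reweighted $\ell_{2}$ minimization for an under-determined system can be sketched as follows. This is a minimal, generic illustration and not the paper's exact formulation: the matrix `A` is a random stand-in for a Mel filterbank, the vector `b` stands in for filterbank outputs, and the function name `irls_sparse_solve`, the $\epsilon$ schedule, and the iteration count are all assumptions made for the example. Each iteration solves a weighted least-norm problem in closed form, with weights $w_i = (x_i^2 + \epsilon)^{-1/2}$ that push small entries toward zero.

```python
import numpy as np

def irls_sparse_solve(A, b, n_iter=30, eps0=1.0, eps_min=1e-8):
    """Approximate a sparse solution of the under-determined system A x = b.

    Generic iteratively reweighted l2 scheme (hypothetical helper, not the
    paper's implementation).
    """
    # Start from the minimum-l2-norm solution x = A^T (A A^T)^{-1} b.
    x = A.T @ np.linalg.solve(A @ A.T, b)
    eps = eps0
    for _ in range(n_iter):
        # Inverse weights q_i = 1 / w_i, with w_i = (x_i^2 + eps)^(-1/2).
        q = np.sqrt(x ** 2 + eps)
        # Closed-form minimiser of sum_i w_i x_i^2 subject to A x = b:
        # x = Q A^T (A Q A^T)^{-1} b, where Q = diag(q).
        AQ = A * q  # broadcasting scales column j by q_j, i.e. A @ diag(q)
        x = q * (A.T @ np.linalg.solve(AQ @ A.T, b))
        # Shrink the smoothing term gradually (assumed schedule).
        eps = max(eps / 10.0, eps_min)
    return x

# Toy problem: 20 measurements of a 60-dimensional vector with 3 nonzeros.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 60))
x_true = np.zeros(60)
x_true[[5, 23, 41]] = [1.5, -2.0, 0.7]
b = A @ x_true

x_hat = irls_sparse_solve(A, b)
residual = np.linalg.norm(A @ x_hat - b)
```

Every iterate satisfies the constraint $Ax = b$ up to numerical precision, so `residual` stays near zero; how closely `x_hat` matches `x_true` depends on the matrix and the sparsity level. Recovering a waveform would additionally require estimating phase from the reconstructed magnitude spectrum, which this sketch does not attempt.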