A Simple Cepstral Domain DNN Approach to Artificial Speech Bandwidth Extension

2018 
In this work, we present a simple deep neural network (DNN)-based regression approach to artificial speech bandwidth extension (ABE) in the frequency domain for estimating missing speech components in the range of 4 to 7 kHz. The upper band (UB) spectral magnitudes are found by first estimating the UB cepstrum by means of a DNN regression and then converting it to the spectral domain, which leads to a more efficient and better generalizing model training than estimating the highly redundant UB magnitudes directly. As a second novelty, the phase information for the estimated UB spectral magnitudes is generated by spectrally shifting the narrowband (NB) phase. Apart from framing, this very simple approach does not introduce additional algorithmic delay. A cross-database and cross-language task is defined for training and evaluation of the ABE framework. In a subjective comparison category rating test, the proposed ABE solution significantly outperforms the competing ABE baseline and was found to improve NB speech quality by 0.80 CMOS points, while the computation time is reduced to about 3 % of that of the ABE baseline.
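The two signal processing steps named in the abstract, converting an estimated UB cepstrum to spectral magnitudes and deriving a UB phase from the NB phase, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the cepstrum definition or the shift rule, so the sketch assumes a DCT-II cepstrum of the log magnitudes and assumes that the highest NB phase bins are copied into the UB. All function names are hypothetical, and the DNN regression itself is left out.

```python
import math


def log_mag_to_cepstrum(magnitudes, n_coeffs):
    """Hypothetical helper: first n_coeffs DCT-II coefficients of the log magnitudes.

    Assumed cepstrum definition; the paper may use a different transform.
    """
    n = len(magnitudes)
    log_mag = [math.log(m) for m in magnitudes]
    return [
        sum(log_mag[k] * math.cos(math.pi * m * (k + 0.5) / n) for k in range(n)) / n
        for m in range(n_coeffs)
    ]


def cepstrum_to_magnitudes(cepstrum, n_bins):
    """Invert the assumed DCT-II cepstrum back to n_bins spectral magnitudes.

    A truncated cepstrum (fewer coefficients than bins) yields a smoothed
    spectral envelope, which is why a few coefficients can stand in for
    many redundant magnitude bins.
    """
    magnitudes = []
    for k in range(n_bins):
        log_mag = cepstrum[0] + 2.0 * sum(
            cepstrum[m] * math.cos(math.pi * m * (k + 0.5) / n_bins)
            for m in range(1, len(cepstrum))
        )
        magnitudes.append(math.exp(log_mag))
    return magnitudes


def shift_nb_phase(nb_phase, n_ub_bins):
    """Assumed shift rule: reuse the highest n_ub_bins NB phase values for the UB."""
    if n_ub_bins > len(nb_phase):
        raise ValueError("not enough NB bins to fill the upper band")
    return list(nb_phase[len(nb_phase) - n_ub_bins:])


def build_ub_spectrum(ub_cepstrum, nb_phase, n_ub_bins):
    """Combine estimated UB magnitudes with the shifted NB phase into complex bins."""
    magnitudes = cepstrum_to_magnitudes(ub_cepstrum, n_ub_bins)
    phases = shift_nb_phase(nb_phase, n_ub_bins)
    return [
        complex(mag * math.cos(ph), mag * math.sin(ph))
        for mag, ph in zip(magnitudes, phases)
    ]
```

In such a setup the DNN would output `ub_cepstrum` for each frame, and `build_ub_spectrum` would supply the complex UB bins that are appended to the NB spectrum before the inverse transform. Both steps operate on the current frame only, which is consistent with the claim that no algorithmic delay is added beyond framing.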