Fftnet: A Real-Time Speaker-Dependent Neural Vocoder

2018 
We introduce FFTNet, a deep learning approach synthesizing audio waveforms. Our approach builds on the recent WaveNet project, which showed that it was possible to synthesize a natural sounding audio waveform directly from a deep convolutional neural network. FFTNet offers two improvements over WaveNet. First it is substantially faster, allowing for real-time synthesis of audio waveforms. Second, when used as a vocoder, the resulting speech sounds more natural, as measured via a “mean opinion scoretest.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    11
    References
    85
    Citations
    NaN
    KQI
    []