Initial investigation of speech synthesis based on complex-valued neural networks

Qiong Hu, Junichi Yamagishi, Korin Richmond, Kartick Subramanian, Yannis Stylianou

2016

Although frequency analysis of speech often yields a representation in the complex domain, the acoustic models in common use are designed for real-valued data, and phase is usually either ignored or modelled separately from spectral amplitude. Here, we propose a complex-valued neural network (CVNN) that directly models the complex-valued output of the frequency analysis (such as the complex amplitude). We also introduce a phase encoding technique that maps real-valued data (e.g. cepstra or log amplitudes) into the complex domain, so that the same CVNN processing can be applied to them seamlessly. In this paper, a fully complex-valued neural network, namely one in which the weight matrices, activation functions and learning algorithm all operate in the complex domain, is applied to speech synthesis. Results show that it can model both complex-valued and real-valued data.
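The abstract does not spell out the phase encoding or the network layers, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes a common form of phase encoding in which a real value in a known range [lo, hi] becomes an angle in [0, pi] on the unit circle, and it uses a complex tanh as a stand-in activation. The names `phase_encode`, `phase_decode` and `complex_layer`, the range bounds, and the choice of activation are all hypothetical and are not taken from the paper.

```python
import numpy as np

def phase_encode(x, lo, hi):
    """Map real values in [lo, hi] to unit-modulus complex numbers.

    Assumed form: z = exp(i * pi * (x - lo) / (hi - lo)), so the value
    is carried entirely by the phase, which lies in [0, pi].
    """
    x = np.asarray(x, dtype=float)
    theta = np.pi * (x - lo) / (hi - lo)
    return np.exp(1j * theta)

def phase_decode(z, lo, hi):
    """Invert phase_encode by reading the angle back off the unit circle."""
    theta = np.angle(z)
    return lo + (hi - lo) * theta / np.pi

def complex_layer(z, W, b):
    """One layer with complex weights, bias and activation.

    tanh is only a placeholder for a fully complex activation function.
    """
    return np.tanh(W @ z + b)

rng = np.random.default_rng(0)

# Hypothetical real-valued features (e.g. a few log-amplitude values).
x = np.array([-1.0, 0.0, 0.5, 2.0])
z = phase_encode(x, lo=-1.0, hi=2.0)
x_back = phase_decode(z, lo=-1.0, hi=2.0)

# Small random complex weights keep the pre-activations away from
# the poles of the complex tanh.
W = 0.1 * (rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4)))
b = np.zeros(3, dtype=complex)
h = complex_layer(z, W, b)
```

With an encoding of this kind, real-valued features and genuinely complex features (such as complex amplitudes) can be passed through the same complex-valued layers, which is the property the abstract relies on.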
