Ergodic hidden control neural network for modelling of the speech process

1993 
The authors extend the hidden control neural network (HCNN) architecture to the ergodic case, i.e., the case in which all control state sequences are allowed. This scheme gives a deeper understanding of the modeling capabilities offered by the HCNN formalism. In fact, the status of each control input binary digit can be interpreted as the presence or absence of a binary phonetic feature defined a posteriori, forcing the network to produce a low prediction error on pairs of speech frames. Major improvements were obtained by normalizing the output vector components by the standard deviations of the prediction error. Further improvements arise from extending the predictor to second order and from appropriately pruning the transition matrix of allowed control states. Rewiring the original architecture as a recurrent network allows smooth spectral trajectories to be resynthesized, once the recurrent network is fed the optimal control sequence found by dynamic programming when matching real speech against the HCNN control input.
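The dynamic-programming search over control sequences mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the squared-error cost, and the toy error table are assumptions made for illustration, and the per-frame errors are taken as precomputed rather than produced by an actual HCNN predictor. The `allowed` matrix stands in for the control state transition matrix: all entries true gives the ergodic case, and false entries model pruning.

```python
import math


def normalized_error(predicted, target, sigma):
    """Squared prediction error with each component scaled by its
    standard deviation (hypothetical cost; the abstract only states
    that outputs are normalized by prediction error deviations)."""
    return sum(((p - y) / s) ** 2 for p, y, s in zip(predicted, target, sigma))


def best_control_sequence(errors, allowed):
    """Viterbi-style search for the lowest-cost control state sequence.

    errors[t][c]  : prediction error at frame t under control state c
                    (assumed precomputed for this sketch).
    allowed[p][c] : True if the transition from state p to state c is kept.
    Returns (total_cost, path).
    """
    n_states = len(errors[0])
    cost = list(errors[0])          # any state may start the sequence
    back = []                       # back[t-1][c] = best predecessor of c at frame t
    for frame_errors in errors[1:]:
        new_cost = [math.inf] * n_states
        pointers = [-1] * n_states
        for c in range(n_states):
            for p in range(n_states):
                if allowed[p][c] and cost[p] < new_cost[c]:
                    new_cost[c] = cost[p]
                    pointers[c] = p
            new_cost[c] += frame_errors[c]
        back.append(pointers)
        cost = new_cost

    state = min(range(n_states), key=lambda c: cost[c])
    total = cost[state]
    if math.isinf(total):
        raise ValueError("no feasible control sequence under 'allowed'")

    # Follow the back-pointers from the last frame to the first.
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return total, path


# Toy error table: 3 frames, 2 control states (made-up numbers).
toy_errors = [[0.1, 1.0], [1.0, 0.2], [0.3, 1.0]]
ergodic = [[True, True], [True, True]]      # every transition allowed
pruned = [[True, True], [False, True]]      # transition 1 -> 0 removed

ergodic_cost, ergodic_path = best_control_sequence(toy_errors, ergodic)
pruned_cost, pruned_path = best_control_sequence(toy_errors, pruned)
```

In this toy case the ergodic search is free to return to state 0 at the last frame, while the pruned matrix forbids that move and yields a different, costlier path. In the setting of the abstract, the resulting control sequence is what would be fed to the recurrent rewiring of the network for resynthesis.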