Deep neural network-based speech recognition with combination of speaker-class models

2015 
This paper proposes a new speech recognition method based on speaker-class (SC) models. Previous studies based on this approach have mainly used Gaussian-mixture-model-based hidden Markov models (GMM-HMMs) as acoustic models. In this work, SC models with a deep neural network (DNN)-based HMM (DNN-HMM) structure are investigated and used for speaker-independent (SI) speech recognition. Realizing SI speech recognition with SC models requires that unsupervised adaptation be performed using only a single utterance. To address this problem, we propose a new method of combining DNN outputs. In our experiments, five of 963 SC models were selected automatically for each utterance, and the selected DNN-HMM-based SC models were combined. The results showed that the proposed method outperformed a baseline DNN-HMM system.
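The per-utterance selection and combination step can be illustrated with a minimal sketch. The abstract does not state the selection criterion or the combination rule, so everything here is an assumption for illustration: each SC model is given a hypothetical per-utterance score, the five highest-scoring models are kept, and their frame-level HMM-state posteriors are combined by a weighted average. The function names `select_top_k` and `combine_posteriors` are hypothetical and do not come from the paper.

```python
import numpy as np


def select_top_k(scores, k=5):
    """Return indices of the k highest-scoring SC models (assumed criterion)."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]  # sort in descending score order
    return order[:k]


def combine_posteriors(posteriors, weights=None):
    """Weighted average of DNN state posteriors (assumed combination rule).

    posteriors: array of shape (num_models, num_frames, num_states)
    weights:    optional array of shape (num_models,); uniform if omitted
    """
    posteriors = np.asarray(posteriors, dtype=float)
    num_models = posteriors.shape[0]
    if weights is None:
        weights = np.full(num_models, 1.0 / num_models)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # make the weights sum to one
    # Contract the model axis: result has shape (num_frames, num_states).
    combined = np.tensordot(weights, posteriors, axes=1)
    # Renormalize so each frame is a valid distribution over states.
    return combined / combined.sum(axis=-1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_models, num_frames, num_states = 963, 4, 3
    scores = rng.random(num_models)  # placeholder per-utterance scores
    raw = rng.random((num_models, num_frames, num_states))
    all_posteriors = raw / raw.sum(axis=-1, keepdims=True)

    chosen = select_top_k(scores, k=5)
    combined = combine_posteriors(all_posteriors[chosen])
    print(chosen.shape, combined.shape)
```

A linear average is only one plausible choice; averaging in the log domain or weighting by the selection scores would be equally reasonable variants under the same assumptions.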