Speech recognition with large-scale speaker-class-based acoustic modeling

2013 
This paper investigates speaker-independent speech recognition with speaker-class models. In previous studies based on this method, the number of speaker classes was relatively small, and it was difficult to improve performance significantly over the baseline. In this work, as many as 500 speaker-class models are used to model speaker characteristics more precisely. To avoid a shortage of training data for each speaker-class model, a soft clustering technique is used in which a training speaker is allowed to belong to several classes. In the recognition experiments, a conventional method with several tens of speaker-class models yielded only a slight improvement in performance. In contrast, an unsupervised soft clustering method with several hundred speaker-class models yielded a significant improvement. In addition, the results indicate that the error rate could be reduced drastically if speaker-class model selection were performed more effectively.
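The soft clustering idea in the abstract, where a training speaker may belong to several classes, can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm, so everything here is an assumption made for illustration: speakers are represented by fixed-length vectors, each class by a centroid, and a speaker is assigned to its `k` nearest classes by Euclidean distance. The names `soft_assign`, `class_training_sets`, and the toy data are hypothetical and not taken from the paper.

```python
import math


def soft_assign(speaker_vecs, centroids, k=2):
    """Assign each speaker to its k nearest classes (hypothetical helper).

    Assumption: nearest-k by Euclidean distance stands in for whatever
    soft clustering criterion the paper actually uses.
    """
    assignments = {}
    for name, vec in speaker_vecs.items():
        # Rank class indices by distance from the speaker vector to each centroid.
        ranked = sorted(
            range(len(centroids)),
            key=lambda c: math.dist(vec, centroids[c]),
        )
        assignments[name] = ranked[:k]
    return assignments


def class_training_sets(assignments, n_classes):
    """Collect, for each class, the speakers whose data would train its model."""
    sets = [[] for _ in range(n_classes)]
    for name, classes in assignments.items():
        for c in classes:
            sets[c].append(name)
    return sets


# Toy data (illustrative only): three speakers, three class centroids.
speakers = {"spk_a": (0.0, 0.0), "spk_b": (1.5, 0.0), "spk_c": (4.0, 0.0)}
centroids = [(0.0, 0.0), (2.0, 0.0), (5.0, 0.0)]

assignments = soft_assign(speakers, centroids, k=2)
training_sets = class_training_sets(assignments, len(centroids))
```

With `k=1` this reduces to hard clustering, where each speaker contributes to exactly one class. Raising `k` lets classes share speakers, which is the mechanism the abstract relies on to keep each of several hundred class models from being starved of training data.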