Efficient multi-lingual unsupervised acoustic model training under mismatch conditions

2014 
We propose a new multi-lingual unsupervised acoustic model (AM) training method for low-resourced languages under mismatch conditions. For such languages, very little or no transcribed speech is available. Unsupervised acoustic modeling that uses AMs of different, well-resourced languages has therefore been proposed. This conventional method has been shown to be effective when acoustic conditions, such as speaking style, are similar between the low-resourced language and the other languages. However, because it is not easy to prepare matched AMs of different languages, a mismatch between each AM and the speech of the low-resourced language often occurs in practice. In this paper, we address this mismatch problem. To generate more accurate automatic transcriptions under mismatch conditions, we introduce two things: (1) initial AMs trained on speech of different languages that is mapped to the phonemes of the low-resourced language, and (2) an iterative process that alternates between training AMs and adapting the initial AMs. Without using any transcriptions, the proposed method achieved a word error rate of 32.1% on the IWSLT2011 evaluation set, while the word error rates of the conventional method and the supervised training method were 39.3% and 22.7%, respectively.
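To make the reported figures easier to interpret, the following is a minimal Python sketch of how a word error rate (WER) is commonly computed from a word-level edit distance, and how a relative reduction between two WERs can be derived. The function names `word_error_rate` and `relative_reduction` are hypothetical, and this is an illustration under those assumptions, not the scoring tool used for the IWSLT2011 evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; real evaluations usually apply
    text normalization before scoring, which is omitted here.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, proposed_wer: float) -> float:
    """Return the relative WER reduction in percent with respect to the baseline."""
    return 100.0 * (baseline_wer - proposed_wer) / baseline_wer


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
    # WERs quoted in the abstract: conventional 39.3%, proposed 32.1%.
    print(round(relative_reduction(39.3, 32.1), 1))
```

Applied to the numbers in the abstract, the proposed method's 32.1% corresponds to a relative reduction of about 18.3% over the conventional method's 39.3%, and it remains 9.4 points (absolute) above the supervised result of 22.7%.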