Multilingual Speech Recognition Training and Adaptation with Language-Specific Gate Units

2018 
The multi-task learning (MTL) framework is commonly adopted to build automatic speech recognition (ASR) systems for multiple languages. The shared-hidden-layer multilingual deep neural network (SHL-MDNN), in which different languages share common hidden layers while each language has its own output layer corresponding to its language-specific set of context-dependent states, brings significant improvements over conventional monolingual baselines. Language-specific adaptation techniques can be expected to further improve the recognition accuracy of the SHL-MDNN. In this paper, language-specific gate units (LGU) are added to the language-specific output layers. The LGU exploits language identity information obtained from the language identification branch of the multi-task framework, injecting it into the output layer of the corresponding speech recognition branch. Experiments show that LGU adaptation provides competitive performance compared to the conventional shared-hidden-layer multilingual neural network, and the multilingual speech recognition system achieves a maximum absolute improvement of 7.04% over the baseline system.
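The gating idea described in the abstract might be sketched as follows. This is a minimal illustrative sketch, not the authors' exact formulation: it assumes a sigmoid gate computed from a language-identity embedding (e.g. a soft posterior from the LID branch) that modulates the shared hidden activations element-wise before the language-specific output layer; all names, shapes, and the gate parameterisation are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def language_gate(hidden, lang_embed, W_g, b_g):
    """Modulate shared hidden activations with a language-specific gate.

    The gate is computed from the language-identity information
    (here, a hypothetical embedding from the LID branch) and applied
    element-wise, so each language can emphasise different dimensions
    of the shared representation before its own output layer.
    """
    gate = sigmoid(lang_embed @ W_g + b_g)  # shape: (hidden_dim,), values in (0, 1)
    return hidden * gate                    # element-wise modulation

# Illustrative usage with random parameters (dimensions are arbitrary).
rng = np.random.default_rng(0)
hidden_dim, lang_dim = 8, 4
hidden = rng.standard_normal(hidden_dim)      # shared-hidden-layer activations
lang_embed = rng.standard_normal(lang_dim)    # language-identity features
W_g = rng.standard_normal((lang_dim, hidden_dim))
b_g = np.zeros(hidden_dim)

gated = language_gate(hidden, lang_embed, W_g, b_g)
```

Because the sigmoid gate lies in (0, 1), each gated activation is a scaled-down copy of the corresponding shared activation, letting each language suppress or pass through dimensions of the shared representation.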