Improving Real-time Recognition of Morphologically Rich Speech with Transformer Language Model

2020 
Transformer models have become the state of the art in natural language understanding, and their use for language modeling in Automatic Speech Recognition (ASR) is also promising. Although Transformer-based language models have been shown to improve ASR performance, their computational complexity makes their application in real-time systems challenging. It has also been shown that the knowledge of such language models can be transferred to traditional n-gram models, which are suitable for real-time decoding. This paper investigates the adaptation of this transfer approach to morphologically rich languages in a real-time scenario. We propose a new method for subword-based neural text augmentation with a Transformer language model, which consists in retokenizing the training corpus into subwords using a statistical, data-driven approach. We demonstrate that ASR performance can be improved while simultaneously reducing the vocabulary size and alleviating memory consumption.
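The abstract does not specify which statistical, data-driven subword method is used; byte-pair encoding (BPE) is one common choice for retokenizing a corpus into subwords, which is especially useful for morphologically rich languages where a word-level vocabulary would explode. As a minimal sketch (assuming a BPE-style approach; the function name, toy corpus, and merge count are illustrative, not taken from the paper):

```python
from collections import Counter

def learn_bpe_merges(vocab, num_merges):
    """Learn BPE merge rules from a dict mapping space-separated
    symbol sequences (with an end-of-word marker) to frequencies."""
    merges = []
    vocab = dict(vocab)
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            symbols = word.split()
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Merge the most frequent pair everywhere in the vocabulary.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = {w.replace(' '.join(best), ''.join(best)): f
                 for w, f in vocab.items()}
    return merges

# Toy corpus: characters as initial symbols, '</w>' marks word ends.
corpus = {
    'l o w </w>': 5,
    'l o w e r </w>': 2,
    'n e w e s t </w>': 6,
    'w i d e s t </w>': 3,
}
merges = learn_bpe_merges(corpus, 4)
```

The learned merges define a subword inventory: frequent morpheme-like units (e.g. a shared suffix) get merged into single tokens, so the n-gram model trained on the augmented text operates over a compact subword vocabulary rather than an open-ended word list.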