Online Speaker Adaptation for LVCSR Based on Attention Mechanism

2018 
Speaker adaptation is one of the most popular and important topics in speech recognition. In this paper, we propose a novel online speaker adaptation technique for deep neural network based large vocabulary continuous speech recognition (LVCSR). In this approach, the i-vectors of the speakers in the training set are extracted and stored as a static memory. For each frame, an attention mechanism selects from the memory the speaker i-vectors most relevant to the current speech segment. We also propose a new attention mechanism to improve performance. The vectors obtained by the attention mechanism provide speaker information that improves recognition accuracy. Experiments on the Switchboard task show that the proposed approach achieves a relative word error rate (WER) reduction of 8.3% over the speaker-independent model without any adaptation data. The result is comparable to that of the popular i-vector based offline speaker adaptation method and is much better than that of the i-vector based online speaker adaptation method.
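
The memory lookup described in the abstract can be illustrated with a minimal sketch. It assumes standard dot-product attention with a softmax, not the new attention mechanism the paper proposes, whose form the abstract does not give. It also assumes the frame-level query has already been projected to the i-vector dimension. The names `softmax`, `attend`, `query`, and `memory` are hypothetical, and the toy vectors are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Return (weights, speaker_vector) for one frame.

    query  : frame-level vector, assumed already projected to the i-vector dimension
    memory : list of training-speaker i-vectors (the static memory)
    """
    # Dot-product relevance score between the frame query and each i-vector.
    scores = [sum(q * m for q, m in zip(query, ivec)) for ivec in memory]
    weights = softmax(scores)
    dim = len(memory[0])
    # Attention-weighted sum of the memory entries.
    speaker_vector = [
        sum(w * ivec[d] for w, ivec in zip(weights, memory)) for d in range(dim)
    ]
    return weights, speaker_vector


# Toy memory of three hypothetical 2-dimensional i-vectors.
memory = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, speaker_vector = attend([2.0, 0.0], memory)
```

In a setup like this, `speaker_vector` would be supplied to the acoustic model alongside the frame features, so no adaptation data from the test speaker is needed; how the paper actually combines the two is not stated in the abstract.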