Learning Monotonic Alignments with Source-Aware GMM Attention

2021 
Transformers with soft attention have been widely adopted in various sequence-to-sequence (Seq2Seq) tasks. Whereas soft attention is effective for learning semantic similarities between queries and keys based on their contents, it does not explicitly model the order of elements in sequences, which is crucial for monotonic Seq2Seq tasks. Learning monotonic alignments between input and output sequences may be beneficial for long-form and online inference applications that remain challenging for the conventional soft attention algorithm. Herein, we focus on monotonic Seq2Seq tasks and propose a source-aware Gaussian mixture model (GMM) attention in which the attention scores are computed monotonically, considering both the content and the order of the source sequence. We experimentally demonstrate that the proposed attention mechanism improves performance on online and long-form speech recognition without degrading offline in-distribution speech recognition.
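The abstract does not spell out the exact source-aware formulation, so the following is only a minimal sketch of the generic GMM-attention idea it builds on: mixture components whose means can only move forward along the source sequence, which enforces monotonic alignment. All names here (GMMAttention, num_mixtures, param_net) are hypothetical, assuming PyTorch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GMMAttention(nn.Module):
    """Sketch of GMM attention with monotonically advancing component means."""

    def __init__(self, query_dim: int, num_mixtures: int = 5):
        super().__init__()
        self.num_mixtures = num_mixtures
        # Predict (weight, width, step) for each mixture component from the query.
        self.param_net = nn.Linear(query_dim, 3 * num_mixtures)

    def forward(self, query, prev_means, memory_lengths):
        """
        query:          (batch, query_dim)     current decoder state
        prev_means:     (batch, num_mixtures)  component means from the previous step
        memory_lengths: (batch,)               lengths of the encoder outputs
        Returns attention weights (batch, max_len) and the updated means.
        """
        max_len = int(memory_lengths.max().item())

        w_hat, b_hat, k_hat = self.param_net(query).chunk(3, dim=-1)
        weights = torch.softmax(w_hat, dim=-1)      # mixture weights, sum to 1
        widths = F.softplus(b_hat) + 1e-4           # positive component widths
        steps = F.softplus(k_hat)                   # non-negative mean increments
        means = prev_means + steps                  # means can only move forward

        # Evaluate the mixture at every source position j = 0 .. max_len-1.
        pos = torch.arange(max_len, device=query.device).view(1, 1, -1)
        scores = weights.unsqueeze(-1) * torch.exp(
            -widths.unsqueeze(-1) * (means.unsqueeze(-1) - pos) ** 2
        )
        attn = scores.sum(dim=1)                    # (batch, max_len)

        # Mask padded positions and renormalize.
        mask = pos.squeeze(1) < memory_lengths.view(-1, 1)
        attn = attn * mask
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-8)
        return attn, means
```

The "source-aware" contribution in the paper additionally conditions the attention on the content of the source sequence rather than on the decoder state alone; that part is not reproduced here because its precise form is not given in the abstract.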