
ASR N-Best Fusion Nets

2021 
Current spoken language understanding (SLU) systems rely heavily on the best hypothesis (ASR 1-best) produced by automatic speech recognition (ASR), which serves as the input to downstream models such as natural language understanding (NLU) modules. However, errors and misrecognitions in the ASR 1-best pose challenges for NLU: without additional signals, NLU models usually cannot recover from ASR errors, which leads to suboptimal SLU performance. This paper proposes a fusion network that jointly considers the ASR n-best hypotheses to improve robustness to ASR errors. Our experiments on Alexa data show that our model achieves a 21.71% error reduction in domain classification compared to a baseline trained on transcriptions.
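The abstract does not detail the fusion architecture, so the following is only a minimal illustrative sketch of one way to fuse n-best hypotheses for domain classification: a shared encoder embeds each hypothesis and the hypothesis embeddings are averaged before a linear classifier. All module choices, names, and hyperparameters here are assumptions, not the paper's actual design.

```python
# Hedged sketch (assumed architecture, not the paper's): encode each ASR
# hypothesis with a shared GRU encoder, average the n-best embeddings, and
# classify the domain from the fused vector.
import torch
import torch.nn as nn


class NBestFusionClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_domains):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_domains)

    def forward(self, hypotheses):
        # hypotheses: (batch, n_best, seq_len) token IDs for the n-best list
        batch, n_best, seq_len = hypotheses.shape
        flat = hypotheses.view(batch * n_best, seq_len)
        embedded = self.embedding(flat)              # (B*n, T, E)
        _, last_hidden = self.encoder(embedded)      # (1, B*n, H)
        sent_vecs = last_hidden.squeeze(0).view(batch, n_best, -1)
        fused = sent_vecs.mean(dim=1)                # simple average fusion
        return self.classifier(fused)                # (batch, num_domains)


# Toy usage: batch of 2 utterances, 5-best hypotheses, 8 tokens each.
model = NBestFusionClassifier(vocab_size=1000, embed_dim=32,
                              hidden_dim=64, num_domains=10)
dummy = torch.randint(1, 1000, (2, 5, 8))
print(model(dummy).shape)  # torch.Size([2, 10])
```

Average pooling is just the simplest fusion choice for illustration; attention-weighted pooling or concatenation of hypothesis embeddings are common alternatives.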