AliMe DA: A Data Augmentation Framework for Question Answering in Cold-start Scenarios

2021 
Cold-start is the most difficult and time-consuming phase when building a question answering based chatbot for a new business scenario, because sufficient training data must first be collected. In this paper, we propose AliMe DA, a practical data augmentation (DA) framework that consists of data production, denoising and consumption, to alleviate this problem. We show how our DA approach can be used to substantially enhance annotation productivity and also improve downstream model performance. More importantly, we provide best practices for data augmentation, including how to choose and employ appropriate methods at each stage of our framework, and share our observations on the scenarios in which data augmentation is applicable in the era of pre-trained language models.