Data Augmentation for BERT Fine-Tuning in Open-Domain Question Answering.

Wei Yang, Yuqing Xie, Luchen Tan, Kun Xiong, Ming Li, Jimmy J. Lin

2019

Recently, a simple combination of passage retrieval using off-the-shelf IR techniques and a BERT reader was found to be very effective for question answering directly on Wikipedia, yielding a large improvement over the previous state of the art on a standard benchmark dataset. In this paper, we present a data augmentation technique using distant supervision that exploits positive as well as negative examples. We apply a stage-wise approach to fine-tuning BERT on multiple datasets, starting with data that is "furthest" from the test data and ending with the "closest". Experimental results show large gains in effectiveness over previous approaches on English QA datasets, and we establish new baselines on two recent Chinese QA datasets.
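
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Example` record, the string-match labeling rule, the `-1` convention for negatives, and the numeric "distance" assigned to each dataset are all assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Example:
    question: str
    passage: str
    answer_start: int  # character offset of the answer; -1 marks a negative example


def distant_examples(question: str, answer: str, passages: List[str]) -> List[Example]:
    """Label retrieved passages by matching them against a known answer string."""
    examples = []
    for passage in passages:
        # Assumption: a passage that contains the answer string is a positive;
        # any other retrieved passage is kept as a negative (offset -1).
        start = passage.lower().find(answer.lower())
        examples.append(Example(question, passage, start))
    return examples


def stagewise_order(datasets: List[Tuple[str, int]]) -> List[str]:
    """Order datasets from "furthest" (largest distance) to "closest" to the test data."""
    ranked = sorted(datasets, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked]


# Hypothetical retrieved passages for one question with a known answer.
passages = [
    "Ottawa is the capital of Canada.",
    "Toronto is the largest city in Canada.",
]
examples = distant_examples("What is the capital of Canada?", "Ottawa", passages)

# Hypothetical dataset names and distances; fine-tuning would visit them in this order.
schedule = stagewise_order([("near", 1), ("far", 3), ("mid", 2)])
```

Here the first passage becomes a positive example and the second a negative one, and `schedule` lists the datasets in the order a stage-wise fine-tuning loop would consume them.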

Keywords:

Baseline (configuration management)
Artificial intelligence
Natural language processing
Computer science
Fine-tuning
Question answering
Test data
Exploit
Open domain
