Citation Field Learning by RNN with Limited Training Data

Yiqing Zhang, Yimeng Dai, Jianzhong Qi, Xinxing Xu, Rui Zhang

2018

Citation field learning segments a plain-text citation string into fields of interest such as author, title, and venue. We study citation field learning on researchers’ homepages, where the task is challenging because homepage creators use free-form citation styles. We address this challenge with neural-network-based approaches that learn citation styles automatically. Such approaches are data-hungry, and manually labeled training data is expensive to obtain. We therefore propose a novel framework that uses auto-generated training data and domain adaptation to augment a manually labeled training dataset of limited size. We also design an adaptive Recurrent Neural Network (RNN) that learns citation styles effectively from the augmented training data. Extensive experiments show that the proposed methods outperform state-of-the-art methods for citation field learning.
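To make the task concrete, the following is a minimal sketch of citation field learning framed as token-level labeling, together with one plausible way to auto-generate labeled training data. The abstract does not say how the training data is generated, so the approach here (rendering structured records in several citation styles) is an assumption, and the names `RECORD`, `STYLES`, `render`, and `generate` are hypothetical and used only for illustration.

```python
import random

# Hypothetical structured record; the field names mirror the examples
# given in the abstract (author, title, venue), plus a year.
RECORD = {
    "author": "A. Author and B. Writer",
    "title": "An Example Paper Title",
    "venue": "Example Conference",
    "year": "2018",
}

# Hypothetical citation styles: each style is an ordered list of
# (field, trailing separator) pairs. Real homepages vary far more.
STYLES = [
    [("author", "."), ("title", "."), ("venue", ","), ("year", ".")],
    [("title", ","), ("author", ","), ("year", ","), ("venue", ".")],
]


def render(record, style):
    """Render one record in one style and return (tokens, labels).

    Every whitespace-separated token is labeled with the field it came
    from, which is the supervision a sequence labeler would train on.
    """
    tokens, labels = [], []
    for field, separator in style:
        words = record[field].split()
        words[-1] += separator  # attach the separator to the field's last token
        tokens.extend(words)
        labels.extend([field] * len(words))
    return tokens, labels


def generate(records, styles, n, seed=0):
    """Sample n labeled citation strings from random record/style pairs."""
    rng = random.Random(seed)
    return [render(rng.choice(records), rng.choice(styles)) for _ in range(n)]


if __name__ == "__main__":
    for tokens, labels in generate([RECORD], STYLES, n=2):
        print(" ".join(tokens))
        print(list(zip(tokens, labels)))
```

A model such as an RNN would then be trained to predict the label sequence from the token sequence. Data generated this way is unlikely to match real homepage citations exactly, which is presumably where the domain adaptation step mentioned in the abstract comes in.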
