Distant Supervision for Relation Extraction via Noise Filtering

2021 
Distant supervision, a widely used method for relation extraction, suffers from label noise. This noise is introduced artificially by the underlying assumption of distant supervision, and it limits the performance of models trained on the resulting data. To address the problem at the sentence level, we model relation extraction with two components: a sentence selector and a relation extractor. The sentence selector, based on reinforcement learning, processes the corpus in units of entity pairs and divides the training corpus into three parts: selected sentences, discarded sentences, and unlabeled sentences. We obtain more semantic information from the training corpus by introducing intra-class attention and inter-class similarity. To filter noisy data more accurately, the model compares the predicted values that the relation extractor produces for the selected and the discarded sentences in each sentence bag. The results show that WPR-RL, the redesigned reinforcement learning algorithm proposed in this study, significantly mitigates the deficiencies of the existing approach. We also carry out a number of composite tests to examine the impact of each improvement on the performance of the model.
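The selector and extractor interaction described above can be illustrated with a minimal sketch. This is not the WPR-RL algorithm, whose details the abstract does not give. It only assumes that a policy assigns each sentence in an entity-pair bag a keep probability, and that a reward can be formed from the gap between the extractor's mean predicted value on the selected and on the discarded sentences. The names `select_sentences`, `bag_reward`, `keep_prob`, and `score` are hypothetical, and the third partition (unlabeled sentences) is left out.

```python
import random
from typing import Callable, List, Sequence, Tuple


def select_sentences(
    bag: Sequence[str],
    keep_prob: Callable[[str], float],
    rng: random.Random,
) -> Tuple[List[str], List[str]]:
    """Split one entity-pair bag into (selected, discarded) sentences.

    keep_prob stands in for the selector policy (an assumption): it returns
    the probability of keeping a sentence, and the action is sampled from it.
    """
    selected: List[str] = []
    discarded: List[str] = []
    for sentence in bag:
        if rng.random() < keep_prob(sentence):
            selected.append(sentence)
        else:
            discarded.append(sentence)
    return selected, discarded


def bag_reward(
    selected: Sequence[str],
    discarded: Sequence[str],
    score: Callable[[str], float],
) -> float:
    """Illustrative reward: mean extractor score on selected minus discarded.

    score stands in for the relation extractor's predicted value for the
    bag's distant label (an assumption). An empty side contributes 0.0.
    """
    def mean(sentences: Sequence[str]) -> float:
        if not sentences:
            return 0.0
        return sum(score(s) for s in sentences) / len(sentences)

    return mean(selected) - mean(discarded)


# Toy bag for the entity pair (Obama, Hawaii) with the distant label "born_in".
toy_bag = [
    "Obama was born in Hawaii.",
    "Obama visited Hawaii last week.",
    "Obama gave a speech in Hawaii.",
]
# Toy policy and extractor: both favor sentences that mention "born".
toy_keep_prob = lambda s: 1.0 if "born" in s else 0.0
toy_score = lambda s: 0.9 if "born" in s else 0.1

toy_selected, toy_discarded = select_sentences(toy_bag, toy_keep_prob, random.Random(0))
toy_reward = bag_reward(toy_selected, toy_discarded, toy_score)
```

In this toy setup a positive reward means the extractor is more confident on the kept sentences than on the dropped ones, which is the kind of signal a policy-gradient update could use to adjust the selector.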