Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations

Raphael Hoffmann, Congle Zhang, Xiao Ling, Luke S. Zettlemoyer, Daniel S. Weld

2011

Information extraction (IE) holds the promise of generating a large-scale knowledge base from the Web's natural language text. Knowledge-based weak supervision, which uses structured data to heuristically label a training corpus, works towards this goal by enabling the automated learning of a potentially unbounded number of relation extractors. Recently, researchers have developed multi-instance learning algorithms to combat the noisy training data that can come from heuristic labeling, but their models assume relations are disjoint; for example, they cannot extract both Founded(Jobs, Apple) and CEO-of(Jobs, Apple). This paper presents a novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts. We apply our model to learn extractors for NY Times text using weak supervision from Freebase. Experiments show that the approach runs quickly and yields surprising gains in accuracy, at both the aggregate and sentence level.
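
The heuristic labeling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge base `KB`, the example sentences, and the function `heuristic_label` are all hypothetical, and entity matching is reduced to a plain substring test. The sketch only shows how one entity pair can carry several relations at once, and how every sentence mentioning the pair lands in the same bag regardless of what it actually expresses.

```python
from collections import defaultdict

# Hypothetical toy knowledge base: entity pair -> set of relations.
# A single pair may hold several (overlapping) relations.
KB = {
    ("Jobs", "Apple"): {"Founded", "CEO-of"},
}

# Hypothetical sentences standing in for a text corpus.
SENTENCES = [
    "Jobs founded Apple in a garage in 1976.",
    "Jobs returned to Apple as chief executive in 1997.",
    "Jobs praised the new Apple store design.",
]

def heuristic_label(sentences, kb):
    """Collect, per KB entity pair, the sentences mentioning both entities,
    and attach every relation the KB lists for that pair."""
    bags = defaultdict(list)
    for sentence in sentences:
        for (e1, e2) in kb:
            # Naive substring match; a real system would use entity linking.
            if e1 in sentence and e2 in sentence:
                bags[(e1, e2)].append(sentence)
    return {pair: (mentions, kb[pair]) for pair, mentions in bags.items()}

labeled = heuristic_label(SENTENCES, KB)
for pair, (mentions, relations) in labeled.items():
    print(pair, sorted(relations), len(mentions))
```

Note that the third sentence expresses neither relation, yet it ends up in the labeled bag. This is the kind of noise that multi-instance learning is meant to tolerate, and the two relations on one pair are the overlap that a disjoint-relation model cannot represent.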
