Learning soft domain constraints in a factor graph model for template-based information extraction

2019 
Abstract The ability to accurately extract key information from textual documents is necessary in several downstream applications e.g., automatic knowledge base population from text, semantic information retrieval, question answering, or text summarization. However, information extraction (IE) systems are far from being errorless and in some cases commit errors that seem obvious to a human expert as they violate common sense or domain knowledge. Towards improving the performance of IE systems, we focus on the question of how domain knowledge can be incorporated into IE models to reduce the number of spurious extractions. Starting from the assumption that such domain knowledge cannot be incorporated explicitly and manually by domain experts due to the amount of effort and technical complexities involved, we propose a machine learning approach in which domain constraints are acquired as a byproduct of learning a model that learns to extract key information in a supervised setting. We frame the task as a template-based information extraction problem in which several dependent slots need to be automatically filled and propose a factor graph based approach to model the joint distribution of slot assignments given a text. Beyond using standard textual features in factors that score the compatibility of slot fillers in relation to the text, we use additional features that are text-independent and capture soft domain constraints. During the training process, these constraints receive a weight as part of the parameter learning process indicating how strongly a constraint should be enforced. These domain constraints are thus ‘soft’ in the sense that they can be violated, but the system learns to penalize solutions that violate them. The soft constraints we introduce come in two flavors: on the one hand we incorporate information about the mean of numerical attributes and use features that indicate how far a certain value is from the mean. We call these features single slot soft constraints. On the other hand, we model the pairwise compatibility between slot filler assignments independent of the textual context, thus modelling the (domain) compatibility of the slot assignments. We call the latter ones pairwise slot soft constraints. As main result of our work, we show that learning pairwise slot soft constraints improves the performance of our extraction model compared to single slot soft constraints by up to 6 points in F 1 , leading to an F 1 score of 0.91 for individual template types. Further, the human readable output format of our model enables the extraction and interpretation of the learned soft constraints. Based on this, we show in an evaluation by domain experts that more than 68% of the learned soft constraints are regarded as plausible.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    22
    References
    1
    Citations
    NaN
    KQI
    []