Detection of Entity Mixture in Knowledge Bases Using Hierarchical Clustering

Haihua Xie, Xiaoqing Lu, Zhi Tang, Xiaojun Huang

2016

Entity mixture in a knowledge base refers to the situation in which some attributes of one entity are mistakenly attributed to another entity; it often occurs among homonymous entities, which share the same value of the attribute “Name”. Eliminating entity mixture is critical to ensuring data accuracy and validity for knowledge-based services. However, current research on entity disambiguation mainly focuses on determining the identity of entities mentioned in text during information extraction for building a knowledge base, while little work has been done to verify the information in an already built knowledge base. In this paper, we propose a generic method to detect mixed homonymous entities in a knowledge base using hierarchical clustering. The principle of our methodology for differentiating entities is to detect inconsistency among their attributes by analyzing how their attribute values are distributed across the documents of a common corpus. Experiments on a data set from industry applications demonstrate the workflow of performing the clustering and detecting mixed entities in a knowledge base using our methodology.
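
The idea in the abstract can be illustrated with a minimal sketch. It is not the method of the paper, whose details are not given here; it assumes that each attribute value of one knowledge-base entry is represented by the set of corpus documents in which it appears, that Jaccard distance measures how differently two values are distributed, and that average-linkage agglomerative clustering with a fixed cut-off groups the values. The function names, the threshold of 0.8, and the sample data are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Distance between two sets of document IDs (1.0 means no shared documents)."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def cluster_attribute_values(doc_sets, threshold=0.8):
    """Agglomerative (average-linkage) clustering of attribute values.

    doc_sets maps each attribute value to the set of document IDs in which
    it appears. Merging stops once the closest pair of clusters is farther
    apart than the (assumed) threshold.
    """
    clusters = [[value] for value in doc_sets]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        dists = [jaccard_distance(doc_sets[x], doc_sets[y]) for x in c1 for y in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        (i, j), dist = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if dist > threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


def is_possibly_mixed(doc_sets, threshold=0.8):
    """Flag an entry whose attribute values fall into more than one cluster."""
    return len(cluster_attribute_values(doc_sets, threshold)) > 1


# Hypothetical entry for one "Name": two attribute values co-occur in
# documents 1-4, the other two in documents 7-9, with no overlap between groups.
doc_sets = {
    "affiliation: Example University": {1, 2, 3},
    "field: document analysis": {2, 3, 4},
    "affiliation: Example Hospital": {7, 8},
    "field: cardiology": {8, 9},
}

clusters = cluster_attribute_values(doc_sets)
mixed = is_possibly_mixed(doc_sets)
```

In this toy entry the attribute values separate into two groups that never share a document, so the entry is flagged as a candidate for entity mixture. A real corpus would need a more careful choice of distance and cut-off.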
