Entity Disambiguation with Context Awareness in User-Generated Short Texts

2020 
Abstract: Conceptualization is the task of obtaining the most appropriate concepts for noun terms (entities) in different contexts, and it plays an important role in human knowledge understanding. In natural language, however, entities are often ambiguous, which makes conceptualization difficult. To conceptualize accurately, the ambiguity of entities must first be eliminated. Existing methods rely mainly on similar or related entities in the context for disambiguation, but user-generated short texts are sparse, so only a limited number of entities can be extracted from them. In this paper, we propose an entity disambiguation method that consists of three steps: 1) measuring the correlation between terms, using both corpus and knowledge information to capture the specific semantic relationship; 2) selecting informative terms, considering various types of contextual terms rather than only entities, thereby mitigating the effects of text sparsity; 3) prioritizing informative terms to highlight their discriminative power, which reduces noise interference. Finally, the target entity is disambiguated based on the informative terms. Experimental results on ground-truth datasets demonstrate that the proposed method outperforms baseline methods.
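The three steps can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract gives no formulas, so the weighted-sum correlation, the threshold `min_corr`, the "best minus mean" discriminative weight, and all toy scores below are assumptions chosen only to make the pipeline concrete for an ambiguous entity such as "apple".

```python
from typing import Dict, List, Tuple

# Hypothetical toy scores (assumptions). A real system would estimate the first
# table from corpus statistics and the second from a knowledge base.
CORPUS_SCORE: Dict[Tuple[str, str], float] = {
    ("pie", "fruit"): 0.8, ("pie", "company"): 0.1,
    ("eat", "fruit"): 0.7, ("eat", "company"): 0.1,
    ("iphone", "company"): 0.9,
    ("the", "fruit"): 0.3, ("the", "company"): 0.3,
}
KNOWLEDGE_SCORE: Dict[Tuple[str, str], float] = {
    ("pie", "fruit"): 0.6,
    ("eat", "fruit"): 0.5,
    ("iphone", "company"): 0.9,
}


def correlation(term: str, concept: str, alpha: float = 0.5) -> float:
    """Step 1 (assumed form): blend corpus and knowledge evidence."""
    key = (term, concept)
    return (alpha * CORPUS_SCORE.get(key, 0.0)
            + (1.0 - alpha) * KNOWLEDGE_SCORE.get(key, 0.0))


def disambiguate(context_terms: List[str],
                 candidate_concepts: List[str],
                 min_corr: float = 0.2) -> Tuple[str, Dict[str, float]]:
    """Pick the candidate concept best supported by the context terms."""
    weighted = []
    for term in context_terms:
        scores = {c: correlation(term, c) for c in candidate_concepts}
        best = max(scores.values())
        # Step 2 (assumed rule): keep any context term, entity or not,
        # that is sufficiently correlated with at least one candidate.
        if best < min_corr:
            continue
        # Step 3 (assumed rule): a term that favors one concept over the
        # others is more discriminative and receives a larger weight.
        mean = sum(scores.values()) / len(scores)
        weighted.append((best - mean, scores))

    # Final step: weighted vote of the informative terms.
    totals = {c: sum(w * s[c] for w, s in weighted)
              for c in candidate_concepts}
    return max(totals, key=totals.get), totals


if __name__ == "__main__":
    concept, _ = disambiguate(["eat", "pie", "the"], ["fruit", "company"])
    print(concept)
```

In this toy setup the verbs and nouns around the target ("eat", "pie") carry the decision even though neither is a related entity of the same type, while a term like "the" is dropped because it correlates weakly and equally with every candidate.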