An Automated Domain Understanding Technique for Knowledge Graph Generation

2021 
Domain-specific Knowledge Graph (KG) generation is a labor-intensive task, usually orchestrated and supervised by subject matter experts. Herein, we propose a strategy to automate the generation process following a two-step approach. First, the structure of the domain of interest is inferred from the corpus in the form of a metagraph. Then, once the domain structure has been discovered, named entity recognition (NER) and relation extraction (RE) models can be used to generate a domain-specific KG. We argue that automatically defining the domain's structure as a first step is beneficial in terms of both construction time and quality of the generated graph. Furthermore, we present a machine learning approach, based on Transformers, to infer the structure of a corpus's domain. The proposed method is extensively validated on three public datasets (WebNLG, NYT and DocRED) by comparing it with two reference methods using CNNs and RNNs. Lastly, we demonstrate how this work lays the foundation for fully automated and unsupervised KG generation.
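To make the notion of a metagraph concrete, the following is a minimal sketch, not the method proposed here: it assumes typed triples are already available, whereas the paper infers the domain structure from the corpus with a Transformer-based model. The sample triples, type labels, and function names (`build_metagraph`, `is_consistent`) are hypothetical and chosen only for illustration. The sketch represents the metagraph as a set of (head type, relation, tail type) edges and shows how such a structure could reject an extracted triple that contradicts the domain.

```python
from collections import Counter

# Hypothetical typed triples: (head, head_type, relation, tail, tail_type).
# These are illustrative values, not data from WebNLG, NYT or DocRED.
TRIPLES = [
    ("Marie Curie", "PERSON", "born_in", "Warsaw", "CITY"),
    ("Alan Turing", "PERSON", "born_in", "London", "CITY"),
    ("Alan Turing", "PERSON", "worked_at", "Bletchley Park", "ORG"),
]


def build_metagraph(triples):
    """Count how often each (head_type, relation, tail_type) edge occurs."""
    return Counter((ht, rel, tt) for _, ht, rel, _, tt in triples)


def is_consistent(triple, metagraph):
    """Return True if the triple's type signature is an edge of the metagraph."""
    _, head_type, relation, _, tail_type = triple
    return (head_type, relation, tail_type) in metagraph


metagraph = build_metagraph(TRIPLES)

# A candidate whose type signature matches the domain structure.
candidate_ok = ("Ada Lovelace", "PERSON", "born_in", "London", "CITY")
# A candidate with head and tail swapped, which the metagraph does not allow.
candidate_bad = ("London", "CITY", "born_in", "Ada Lovelace", "PERSON")
```

Under these assumptions, `is_consistent(candidate_ok, metagraph)` holds while `is_consistent(candidate_bad, metagraph)` does not, which is one way a known domain structure could improve the quality of triples produced by NER and RE models.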