Question Answering When Knowledge Bases are Incomplete

2020 
While systems for question answering over knowledge bases (KB) continue to progress, real world usage requires systems that are robust to incomplete KBs. Dependence on the closed world assumption is highly problematic, as in many practical cases the information is constantly evolving and KBs cannot keep up. In this paper we formalize a typology of missing information in knowledge bases, and present a dataset based on the Spider KB question answering dataset, where we deliberately remove information from several knowledge bases, in this case implemented as relational databases (The dataset and the code to reproduce experiments are available at https://github.com/camillepradel/IDK.). Our dataset, called IDK (Incomplete Data in Knowledge base question answering), allows to perform studies on how to detect and recover from such cases. The analysis shows that simple baselines fail to detect most of the unanswerable questions.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    23
    References
    2
    Citations
    NaN
    KQI
    []