A Corpus for Hybrid Question Answering Systems

2018 
Question answering has been the focus of a lot of researches and evaluation campaigns, either for text-based systems (TREC and CLEF evaluation campaigns for example), or for knowledge-based systems (QALD, BioASQ). Few systems have effectively combined both types of resources and methods in order to exploit the fruitfulness of merging the two kinds of information repositories. The only evaluation QA track that focuses on hybrid QA is QALD since 2014. As it is a recent task, few annotated data are available (around 150 questions). In this paper, we present a question answering dataset that was constructed to develop and evaluate hybrid question answering systems. In order to create this corpus, we collected several textual corpora and augmented them with entities and relations of a knowledge base by retrieving paths in the knowledge base which allow to answer the questions. The resulting corpus contains 4300 question-answer pairs and 1600 have a true link with DBpedia.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    32
    References
    3
    Citations
    NaN
    KQI
    []