Query Expansion Based on WordNet and Word2vec for Italian Question Answering Systems
2017
Question Answering (QA) systems have recently emerged as an efficient way to help users find appropriate answers to questions about a specific situation. One of the major modern QA paradigms is based on Information Retrieval (IR) techniques: the text of a user question is analyzed to extract a set of relevant keywords, queries built on those keywords are submitted to a search engine, and candidate answers are extracted from the documents matching the queries. However, in semantically rich and complex languages such as Italian, many concepts can be expressed in a variety of distinct linguistic forms. This problem arises in particular when QA is applied to small document collections in a closed domain, where an answer may appear only once and its exact wording may differ partially or completely from that of the query. To address this issue, this paper proposes a hybrid Query Expansion (QE) approach in which lexical resources and word embeddings (WEs) are combined to generate synonyms and hypernyms of the relevant words extracted from the user question, and to contextualize this set with respect to both the corpus of interest and the specific question. An experimental session was arranged to compare the proposed QE approach with other techniques and to evaluate its impact on the accuracy of a QA system in extracting correct answers to factoid questions from documents in the Cultural Heritage domain. The experiments showed the effectiveness of the proposed solution with respect to three evaluation metrics commonly used in the literature.
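The hybrid idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon (`LEXICON`), the toy vectors (`VECTORS`), the averaged question vector, and the `threshold` parameter are all assumptions introduced here for illustration. A real system would presumably draw candidates from an Italian WordNet-style resource and use embeddings trained on the target corpus.

```python
import math

# Hypothetical toy lexical resource standing in for a WordNet-like lexicon.
LEXICON = {
    "dipinto": {"synonyms": ["quadro", "tela"], "hypernyms": ["opera"]},
    "autore": {"synonyms": ["artista", "creatore"], "hypernyms": ["persona"]},
}

# Hypothetical toy word vectors standing in for embeddings trained on the corpus.
VECTORS = {
    "dipinto": [0.9, 0.1, 0.0],
    "quadro": [0.8, 0.2, 0.1],
    "tela": [0.7, 0.1, 0.3],
    "opera": [0.6, 0.4, 0.1],
    "autore": [0.1, 0.9, 0.1],
    "artista": [0.2, 0.8, 0.2],
    "creatore": [0.1, 0.2, 0.9],
    "persona": [0.0, 0.3, 0.8],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def question_vector(keywords):
    """Average the vectors of the question keywords (an assumed context model)."""
    vecs = [VECTORS[k] for k in keywords if k in VECTORS]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def expand_query(keywords, threshold=0.7):
    """Add lexicon candidates whose vectors are close to the question context."""
    context = question_vector(keywords)
    expanded = list(keywords)
    if context is None:
        return expanded
    for kw in keywords:
        entry = LEXICON.get(kw, {})
        candidates = entry.get("synonyms", []) + entry.get("hypernyms", [])
        for cand in candidates:
            if cand in expanded or cand not in VECTORS:
                continue
            # Keep only candidates that fit the question context.
            if cosine(VECTORS[cand], context) >= threshold:
                expanded.append(cand)
    return expanded


result = expand_query(["dipinto", "autore"])
print(result)
```

In this sketch the lexicon proposes candidate synonyms and hypernyms, and the embedding similarity acts as a filter that discards candidates unrelated to the question as a whole. The threshold value is arbitrary and would need tuning on real data.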