Learning to Adaptively Rank Document Retrieval System Configurations

Romain Deveaud,Josiane Mothe,Zia Ullah,Jian-Yun Nie

Learning to Adaptively Rank Document Retrieval System Configurations

2019

Modern Information Retrieval (IR) systems have become more and more complex, involving a large number of parameters. For example, a system may choose from a set of possible retrieval models (BM25, language model, etc.), or various query expansion parameters, whose values greatly influence the overall retrieval effectiveness. Traditionally, these parameters are set at a system level based on training queries, and the same parameters are then used for different queries. We observe that it may not be easy to set all these parameters separately, since they can be dependent. In addition, a global setting for all queries may not best fit all individual queries with different characteristics. The parameters should be set according to these characteristics. In this article, we propose a novel approach to tackle this problem by dealing with the entire system configurations (i.e., a set of parameters representing an IR system behaviour) instead of selecting a single parameter at a time. The selection of the best configuration is cast as a problem of ranking different possible configurations given a query. We apply learning-to-rank approaches for this task. We exploit both the query features and the system configuration features in the learning-to-rank method so that the selection of configuration is query dependent. The experiments we conducted on four TREC ad hoc collections show that this approach can significantly outperform the traditional method to tune system configuration globally (i.e., grid search) and leads to higher effectiveness than the top performing systems of the TREC tracks. We also perform an ablation analysis on the impact of different features on the model learning capability and show that query expansion features are among the most important for adaptive systems.

Keywords:

Correction
Source
Cite
Save
Machine Reading By IdeaReader

References

Citations