Approximate Nearest Neighbor Search and Lightweight Dense Vector Reranking in Multi-Stage Retrieval Architectures

2020 
In the context of a multi-stage retrieval architecture, we explore candidate generation based on approximate nearest neighbor (ANN) search and lightweight reranking based on dense vector representations. These results serve as input to slower but more accurate rerankers such as those based on transformers. Our goal is to characterize the effectiveness-efficiency tradeoff space in this context. We find that, on sentence-length segments of text, ANN techniques coupled with dense vector reranking dominate approaches based on inverted indexes, and thus our proposed design should be preferred. For paragraph-length segments, ANN-based and index-based techniques share the Pareto frontier, which means that the choice of alternatives depends on the desired operating point.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    21
    References
    8
    Citations
    NaN
    KQI
    []