Positional Translation Language Model for Ad-Hoc Information Retrieval

2014 
Most existing language modeling approaches rest on the term independence hypothesis. Two main directions have been investigated to go beyond this assumption. The first uses proximity features, which capture the degree to which search terms appear close to each other in a document. The second uses semantic relationships between words. Previous studies have shown that both types of information are useful for improving retrieval performance. Intuitively, combining them should improve retrieval performance further. Based on this idea, this paper proposes a positional translation language model that explicitly incorporates both types of information within the language modeling framework in a unified way. In the first step, we present a proximity-based method to estimate word-word translation probabilities. Then, we define a translation document model for each position of a document and use these document models to score the document. Experimental results on standard TREC collections show that the proposed model achieves significant improvements over state-of-the-art models, including the positional language model and translation language models.