REINA at WebCLEF 2006: Mixing fields to improve retrieval

2006 
This paper describes our work at the CLEF 2006 Robust task. This is an ad-hoc task that explores methods for stable retrieval by focusing on poorly performing topics. We ran experiments for all subtasks: monolingual (EN, ES, FR and IT), bilingual (IT→ES) and multilingual (ES→[EN ES FR IT]) retrieval. For monolingual retrieval we focused on local query expansion, i.e. using only the information from retrieved documents. External corpora, such as the Web, were not used. Our document retrieval system is simple; it is based on the vector space model. Several local expansion techniques were applied to the training topics. The best improvement was achieved using association thesauri, which were constructed from co-occurrence relations within term windows rather than within complete documents. This technique is effective and can be easily implemented without tuning parameters. Our mandatory runs (title+description topic fields) obtained good positions in all the monolingual subtasks in which we participated. For bilingual retrieval, two machine translation programs were used to translate the topics from Italian into Spanish. Both translations were joined before searching. The same expansion technique was also applied. Our mandatory run obtained the top rank in the bilingual subtask. For multilingual retrieval we used the same procedure to obtain the retrieval list for each target language, and we combined the lists with the MAX-MIN data fusion method. In this subtask, our mandatory run placed in the lower part of the ranking of runs.
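As an illustration of the window-based association thesaurus mentioned above, the following is a minimal sketch in Python, not the authors' implementation. The function names (`build_association_thesaurus`, `expand_query`), the window size, the `top_k` cutoff and the Dice-style association score are all assumptions made for the example; the abstract does not specify them. In a local expansion setting, `docs` would be the tokenized top-ranked documents retrieved for the original query.

```python
from collections import Counter, defaultdict


def build_association_thesaurus(docs, window=5):
    """Count co-occurrences of terms that appear within the same term window.

    `docs` is a list of token lists. `window` (hypothetical default) is the
    window length in tokens, counting the current term itself.
    """
    term_freq = Counter()
    cooc = defaultdict(Counter)
    for tokens in docs:
        term_freq.update(tokens)
        for i, term in enumerate(tokens):
            # Look only at the tokens that follow `term` inside the window,
            # then record the pair symmetrically.
            for other in tokens[i + 1 : i + window]:
                if other != term:
                    cooc[term][other] += 1
                    cooc[other][term] += 1
    return term_freq, cooc


def expand_query(query_terms, term_freq, cooc, top_k=3):
    """Return the `top_k` terms most associated with the query terms.

    The Dice-style score below is an assumption chosen for illustration.
    """
    scores = Counter()
    for q in query_terms:
        for term, n in cooc.get(q, {}).items():
            if term not in query_terms:
                scores[term] += 2 * n / (term_freq[q] + term_freq[term])
    return [term for term, _ in scores.most_common(top_k)]


# Toy usage: with window=2 only adjacent terms co-occur.
docs = [
    ["robust", "retrieval", "query", "expansion"],
    ["query", "expansion", "thesaurus"],
]
term_freq, cooc = build_association_thesaurus(docs, window=2)
expanded = expand_query(["query"], term_freq, cooc, top_k=1)
```

Restricting co-occurrence to a window, rather than to the whole document, keeps only pairs of terms that appear close together, which is the distinction the abstract draws.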