Using graph theory and edit distance for information retrieval on XML documents.

2011 
Information retrieval on semi-structured documents such as XML (SIR) lets users narrow a search down to the level of individual document elements. Queries and semi-structured documents can both be viewed as hierarchically nested elements, so we evaluate their structural proximity through the similarity of their trees. Our SIR approach combines a content score and a structure score, the latter based on tree edit distance (the minimal cost of the operations needed to turn one tree into another). We then propagate and combine these scores according to the neighbourhood of each node in the document tree. The approach was evaluated on the SSCAS task of INEX 2005, and our first results suggest that it is promising.

KEYWORDS: Information retrieval, graphs, XML, edit distance.
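
To make the notion of tree edit distance concrete, here is a minimal sketch of the textbook recursive formulation for ordered, labelled trees. It is not the authors' algorithm or cost model: unit costs for insertion, deletion and relabelling are an assumption, and the helper names (`tree`, `forest_distance`, `tree_edit_distance`) and the element names in the example are hypothetical.

```python
from functools import lru_cache


def tree(label, *children):
    """Build an immutable tree as (label, (child, child, ...))."""
    return (label, tuple(children))


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Edit distance between two ordered forests (tuples of trees).

    Assumption: every insertion, deletion or relabelling costs 1.
    """
    if not f and not g:
        return 0
    if not f:
        # Only insertions remain: insert the rightmost root of g.
        _, kids = g[-1]
        return forest_distance(f, g[:-1] + kids) + 1
    if not g:
        # Only deletions remain: delete the rightmost root of f.
        _, kids = f[-1]
        return forest_distance(f[:-1] + kids, g) + 1

    (label_v, kids_v), (label_w, kids_w) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_distance(f[:-1] + kids_v, g) + 1
    # Insert the rightmost root of g; its children take its place.
    insert = forest_distance(f, g[:-1] + kids_w) + 1
    # Match the two rightmost roots, relabelling if their labels differ.
    match = (
        forest_distance(kids_v, kids_w)
        + forest_distance(f[:-1], g[:-1])
        + (0 if label_v == label_w else 1)
    )
    return min(delete, insert, match)


def tree_edit_distance(t1, t2):
    """Edit distance between two single trees."""
    return forest_distance((t1,), (t2,))


# Hypothetical query and document structures.
query = tree("article", tree("sec", tree("p")))
document = tree("article", tree("bdy", tree("sec", tree("p"))))
distance = tree_edit_distance(query, document)
```

In this example the document differs from the query only by the extra `bdy` element, so a single insertion turns one tree into the other. A structure score could then be derived from this distance and combined with a content score, but how that is done in the approach above is not specified here.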