Keyword search over probabilistic XML data

2015 
Despite the proliferation of work on XML keyword search, supporting keyword search over uncertain XML data remains an open problem. In this paper, we study the problem of ELCA-based answers over uncertain XML data, that is, retrieving the subtrees whose probability of being ELCA-based answers is at least a given threshold. To answer such queries efficiently, we employ a filtering-and-verification strategy based on a proposed probabilistic inverted index, PrIndex. Using PrIndex, we develop tight lower and upper bounds that quickly prune unqualified results. We then propose an efficient PrIndex-based algorithm that combines probability threshold pruning with the probability distributions of nodes, computed from leaf to root, to support keyword search over probabilistic XML data. Extensive experimental results demonstrate the effectiveness of the proposed algorithms.
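The filtering-and-verification idea can be illustrated with a minimal, generic sketch. It is not the paper's PrIndex-based algorithm: the `Candidate` type, the `filter_and_verify` function, and the `exact_probability` callback are hypothetical names introduced here, and the sketch simply assumes that each candidate subtree root already carries a lower and an upper bound on its probability of being an ELCA-based answer.

```python
from typing import Callable, Iterable, List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical candidate: a node with bounds on its answer probability."""
    node_id: str
    lower: float  # assumed lower bound on the probability
    upper: float  # assumed upper bound on the probability


def filter_and_verify(
    candidates: Iterable[Candidate],
    threshold: float,
    exact_probability: Callable[[str], float],
) -> List[str]:
    """Return the nodes whose probability is at least `threshold`.

    The expensive `exact_probability` call is made only when the bounds
    alone cannot decide a candidate.
    """
    answers = []
    for cand in candidates:
        # Filtering: the upper bound is already below the threshold, so prune.
        if cand.upper < threshold:
            continue
        # Filtering: the lower bound already reaches the threshold, so accept.
        if cand.lower >= threshold:
            answers.append(cand.node_id)
            continue
        # Verification: the bounds straddle the threshold, so compute exactly.
        if exact_probability(cand.node_id) >= threshold:
            answers.append(cand.node_id)
    return answers
```

The tighter the bounds, the fewer candidates fall through to the verification step, which is why tight lower and upper bounds matter for efficiency.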