Decidable XPath Fragments in the Real World

2018 
XPath is arguably the most popular query language for selecting elements in XML documents. Besides query evaluation, query satisfiability and containment are the main computational problems for XPath; they are useful, for instance, to detect dead code or validate query optimisations. These problems are undecidable in general, but several fragments have been identified over time for which satisfiability (or query containment) is decidable: CoreXPath 1.0 and 2.0 without so-called data joins, fragments with data joins but limited navigation, etc. However, these fragments are often given in a simplified syntax, and sometimes wrt. a simplified XPath semantics. Moreover, they have been studied mostly with theoretical motivations, with little consideration for the practically relevant features of XPath. To investigate the practical impact of these theoretical fragments, we design a benchmark compiling thousands of real-world XPath queries extracted from open-source projects. These queries are then matched against syntactic fragments from the literature. We investigate how to extend these fragments with seldom-considered features such as free variables, data tests, data joins, and the last() and id() functions, for which we provide both undecidability and decidability results. We analyse the coverage of the original and extended fragments, and further provide a glimpse at which other practically-motivated features might be worth investigating in the future.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    52
    References
    0
    Citations
    NaN
    KQI
    []