Efficient Evaluation of Regular Path Expressions on Streaming XML Data
2000
The adoption of XML promises to accelerate construction of systems that integrate distributed, heterogeneous data. Query languages for XML are typically based on regular path expressions that traverse the logical XML graph structure; the eAEcient evaluation of such path expressions is central to good query processing performance. Most existing XML query processing systems convert XML documents to an internal representation, generally a set of tables or objects; path expressions are evaluated using either index structures or join operations across the tables or objects. Unfortunately, the required index creation or join operations are often costly even with locally stored data, and they are especially expensive in the data integration domain, where the system reads data streamed from remote sources across a network, and seldom reuses results for subsequent queries. This paper presents the x-scan operator which eAEciently processes non-materialized XML data as it is being received by the data integration system. X-scan matches regular path expression patterns from the query, returning results in pipelined fashion as the data streams across the network. We experimentally demonstrate the bene ts of the x-scan operator versus the approaches used in current systems, and we analyze the algorithm's performance and scalability across a range of XML document types and queries.
Keywords:
- Correction
- Cite
- Save
- Machine Reading By IdeaReader
16
References
43
Citations
NaN
KQI