Storage and Retrieval of XML Documents Without Redundant Path Information

Lee Hiye-Ja, Jeong Byeong-Soo, Kim Dae Ho, Lee Young-Koo

2005

This paper proposes an approach that removes redundant path information and uses an inverted index to store a large volume of XML documents efficiently and to retrieve the desired information from them. An XML document is decomposed into nodes based on its tree structure, and the nodes are stored in relational tables according to node type, together with the path from the root to each node. Existing methods that use path information store data for every element path, so retrieval performance degrades as the data volume grows. Our approach stores data only for leaf element paths and excludes internal element paths. Because the inverted index is built from leaf element paths only, the number of posting lists per keyword is smaller than in the existing methods. To store and retrieve XML data, our approach requires neither the XML schema of the documents nor any extension of the relational database. Within the scope of our experiments, our approach performs better than the existing approaches.
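As an illustration only, the leaf-path idea can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `index_leaf_paths`, the in-memory dictionaries and lists standing in for relational tables, the whitespace tokenizer, and the sample document are all hypothetical, and attributes and mixed content are ignored for brevity.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def index_leaf_paths(xml_text):
    """Hypothetical helper: index one XML document by leaf element paths only.

    Returns (path_table, leaf_rows, inverted_index). Plain Python containers
    stand in for the relational tables described in the abstract.
    """
    root = ET.fromstring(xml_text)
    path_table = {}               # leaf path string -> path id (each path stored once)
    leaf_rows = []                # (node_id, path_id, text), leaf elements only
    inverted = defaultdict(list)  # keyword -> posting list of node ids

    def visit(elem, prefix):
        path = prefix + "/" + elem.tag
        children = list(elem)
        if children:
            # Internal element: no row and no path entry is stored for it.
            for child in children:
                visit(child, path)
            return
        # Leaf element: record its root-to-leaf path and its text.
        text = (elem.text or "").strip()
        path_id = path_table.setdefault(path, len(path_table) + 1)
        node_id = len(leaf_rows) + 1
        leaf_rows.append((node_id, path_id, text))
        # Assumed tokenizer: lowercase, split on whitespace, one posting per node.
        for word in set(text.lower().split()):
            inverted[word].append(node_id)

    visit(root, "")
    return path_table, leaf_rows, dict(inverted)


# Hypothetical sample document, not taken from the paper.
SAMPLE = (
    "<library>"
    "<book><title>XML Storage</title><author>Lee</author></book>"
    "<book><title>Relational Storage</title><author>Kim</author></book>"
    "</library>"
)

paths, rows, index = index_leaf_paths(SAMPLE)
```

In this sketch the internal paths `/library` and `/library/book` never appear in `paths`; only the two leaf paths do, and each keyword's posting list refers to leaf nodes alone.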
