IMBBTC: XML Document Indexing Model Based on Binary Tree Coding

Zhixin Hu

IMBBTC: XML Document Indexing Model Based on Binary Tree Coding

2015

Zhixin Hu

In order to facilitate decision relation of nodes, support dynamic updates and improves the speed for XML data query, etc, this paper proposes a XML document indexing structure model based on binary tree encoding. The XML document tree uses trigeminal linked list of binary tree structure to encode nodes. The indexing model of binary sort tree was established, which uses elements of the leaf node as indexing terms, and combines with semantic information of nodes. This paper gives some corresponding algorithms, implements the prototype system of indexing model and its corresponding simulation experiments. Theoretical analysis and experimental results show that the indexing model not only supports update operation of nodes and facilitates decision relation of nodes, but also has advantage of short query time. Introduction Establishing the appropriate indexing structure combine with coding technology is one of the commonly used means [1-3]. In order to support and accelerate query processing, people put forward many coding scheme and design adaptive indexing structure. There are three main types of XML indexing: value, path and sequence indexing at present, such as a (k), Data Guide, XISS [4], and so on. It thinks elements and attributes as basic unit of query in XISS indexing, which idea is that separate the complicated paths into the simple paths, and needs join the simple paths. But the process mode is a time-consuming process. The SUPEX builds up indexing by using DTD information. A combined indexing is proposed, which has characteristics of small space overhead and high query efficiency, and can realize hybrid query based on content and text [5]. The above indexing structure is generally based on region code, prefix code and Dewey code. Researches of indexing based on binary tree code are few at present. From this perspective, this paper designs indexing of XML document based on binary tree coding [6], which takes binary tree as XML tree node code by using method of literature. In order to realize need of the XML document optimized query, an indexing of binary sort tree is established by taking leaf nodes as keywords in a particular order, which takes binary tree coding of trigeminal chain as nodes coding of XML tree, combine the document node semantics with characteristics of XML document, which can achieve keywords approximate query. XML document Coding Schema Based on Binary Tree Traversal Construction of XML Document Nodes Coding Firstly, XML document shall be transformed into XML document tree, and then the XML document shall be transformed into binary tree. Each node of the XML document tree are to be numbered by n (n∈natural number, n represents the position information of node in XML tree) in the order top-to-down or down-to-top. Each node are stored by trigeminal linked list structure, namely, each node has left-child point, right-child point and parents point (in order to judge conveniently relationship between nodes and nodes). Construction of node code is achieved by two steps: Step1: XML document shall be transformed into XML document tree (Using the left child – right brother method). Step2: The binary tree with tree trigeminal: nodes of inverted the binary tree is by stored with the trigeminal chain, namely, each node of the binary tree includes three pointer fields: left child field, right field and parents pointer fields. International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2015) © 2015. The authors Published by Atlantis Press 1838 Decision of nodes relation Property 1. (Decision of father and son / brother): Let vi , v j and vk be any node in the binary tree, if v v lchild i k = −> and v v lchild j k = −> , then and are brothers, i v and k v are father-son or j v and k v are father-son. Property 2. (Decision of the grandfather and grandson): Let vi , v j and vk be any node in the binary tree, if v parent v k j − > = and v parent v j i − > = , then i and k are the grandfather and grandson. Indexing Structure Model based on Binary Tree Coding Thinking of Indexing Design The element or node information is presented in XML document tree leaf nodes in accordance with characteristic of the XML document and XML document tree. Starting from the root node, it does not query along with a path until the leaf node. From this perspective, this paper constructs indexing structure table of XML document by using indexing technology, based on element leaf node for indexing key item, integrates with semantic information of nodes, and uses algorithm of binary sort trees. The model uses DFS (Depth First Search) thinking to travel XML document. Due to combining the nodes semantic information of the XML document and characteristic of binary sort tree, only some relevant leaf nodes need be traveled and visited. Therefore, by traveling some relevant node, indexing table of binary sort tree can be constructed. Taking the XML document of the Fig.1 for example, corresponding document tree is shown in Fig.2. Fig.1: DBLP.XML Document

Keywords:

Correction
Source
Cite
Save
Machine Reading By IdeaReader

References

Citations