Fractal tree index

In computer science, a fractal tree index is a tree data structure that keeps data sorted and allows searches and sequential access in the same time as a B-tree but with insertions and deletions that are asymptotically faster than a B-tree. Like a B-tree, a fractal tree index is a generalization of a binary search tree in that a node can have more than two children. Furthermore, unlike a B-tree, a fractal tree index has buffers at each node, which allow insertions, deletions and other changes to be stored in intermediate locations. The goal of the buffers is to schedule disk writes so that each write performs a large amount of useful work, thereby avoiding the worst-case performance of B-trees, in which each disk write may change a small amount of data on disk. Like a B-tree, fractal tree indexes are optimized for systems that read and write large blocks of data. The fractal tree index has been commercialized in databases by Tokutek. Originally, it was implemented as a cache-oblivious lookahead array, but the current implementation is an extension of the Bε tree. The Bε is related to the Buffered Repository Tree. The Buffered Repository Tree has degree 2, whereas the Bε tree has degree Bε. The fractal tree index has also been used in a prototype filesystem. An open source implementation of the fractal tree index is available, which demonstrates the implementation details outlined below. In computer science, a fractal tree index is a tree data structure that keeps data sorted and allows searches and sequential access in the same time as a B-tree but with insertions and deletions that are asymptotically faster than a B-tree. Like a B-tree, a fractal tree index is a generalization of a binary search tree in that a node can have more than two children. Furthermore, unlike a B-tree, a fractal tree index has buffers at each node, which allow insertions, deletions and other changes to be stored in intermediate locations. The goal of the buffers is to schedule disk writes so that each write performs a large amount of useful work, thereby avoiding the worst-case performance of B-trees, in which each disk write may change a small amount of data on disk. Like a B-tree, fractal tree indexes are optimized for systems that read and write large blocks of data. The fractal tree index has been commercialized in databases by Tokutek. Originally, it was implemented as a cache-oblivious lookahead array, but the current implementation is an extension of the Bε tree. The Bε is related to the Buffered Repository Tree. The Buffered Repository Tree has degree 2, whereas the Bε tree has degree Bε. The fractal tree index has also been used in a prototype filesystem. An open source implementation of the fractal tree index is available, which demonstrates the implementation details outlined below. In fractal tree indexes, internal (non-leaf) nodes can have a variable number of child nodes within some pre-defined range. When data is inserted or removed from a node, its number of child nodes changes. In order to maintain the pre-defined range, internal nodes may be joined or split. Each internal node of a B-tree will contain a number of keys that is one less than its branching factor. The keys act as separation values which divide its subtrees. Keys in subtrees are stored in search tree order, that is, all keys in a subtree are between the two bracketing values. In this regard, they are just like B-trees. Fractal tree indexes and B-trees both exploit the fact that when a node is fetched from storage, a block of memory, whose size is denoted by B {displaystyle B} , is fetched. Thus, nodes are tuned to be of size approximately B {displaystyle B} . Since access to storage can dominate the running time of a data structure, the time-complexity of external memory algorithms is dominated by the number of read/writes a data structure induces. (See, e.g., for the following analyses.) In a B-tree, this means that the number of keys in a node is targeted to be enough to fill the node, with some variability for node splits and merges. For the purposes of theoretical analysis, if O ( B ) {displaystyle O(B)} keys fit in a node, then the tree has depth O ( log B ⁡ N ) {displaystyle O(log _{B}N)} , and this is the I/O complexity of both searches and insertions. Fractal trees nodes use a smaller branching factor, say, of B {displaystyle {sqrt {B}}} . The depth of the tree is then O ( log B ⁡ N ) = O ( log B ⁡ N ) {displaystyle O(log _{sqrt {B}}N)=O(log _{B}N)} , thereby matching the B-tree asymptotically. The remaining space in each node is used to buffer insertions, deletion and updates, which we refer to in aggregate as messages. When a buffer is full, it is flushed to the children in bulk. There are several choices for how the buffers are flushed, all leading to similar I/O complexity. Each message in a node buffer will be flushed to a particular child, as determined by its key. Suppose, for concreteness, that messages are flushed that are heading to the same child, and that among the B + 1 {displaystyle {sqrt {B}}+1} children, we pick the one with the most messages. Then there are at least B B + 1 ≈ B {displaystyle {frac {B}{{sqrt {B}}+1}}approx {sqrt {B}}} messages that can be flushed to the child. Each flush requires O ( 1 ) {displaystyle O(1)} flushes, and therefore the per-message cost of a flush is O ( 1 B ) {displaystyle Oleft({frac {1}{sqrt {B}}} ight)} . Consider the cost of an insertion. Each message gets flushed O ( log B ⁡ N ) {displaystyle O(log _{B}N)} times, and the cost of a flush is O ( 1 B ) {displaystyle Oleft({frac {1}{sqrt {B}}} ight)} . Therefore, the cost of an insertion is O ( log B ⁡ N B ) {displaystyle Oleft({frac {log _{B}N}{sqrt {B}}} ight)} . Finally, note that the branching factor can vary, but for any branching factor B ε {displaystyle B^{varepsilon }} , the cost of a flush is O ( 1 B 1 − ε ) {displaystyle Oleft({frac {1}{B^{1-varepsilon }}} ight)} , thereby providing a smooth tradeoff between search cost, which depends on the depth of the search tree, and therefore the branching factor, versus the insertion time, which depends on the depth of the tree but more sensitively on the size of the buffer flushes.

Parent Topic

Child Topic

No Parent Topic