Fusion tree

In computer science, a fusion tree is a type of tree data structure that implements an associative array on $w$-bit integers. When operating on a collection of $n$ key–value pairs, it uses $O(n)$ space and performs searches in $O(\log_w n)$ time, which is asymptotically faster than a traditional self-balancing binary search tree, and also better than the van Emde Boas tree for large values of $w$. It achieves this speed by exploiting certain constant-time operations that can be done on a machine word. Fusion trees were invented in 1990 by Michael Fredman and Dan Willard.

Several advances have been made since Fredman and Willard's original 1990 paper. In 1999 it was shown how to implement fusion trees under a model of computation in which all of the underlying operations of the algorithm belong to $AC^0$, a model of circuit complexity that allows addition and bitwise Boolean operations but disallows the multiplication operations used in the original fusion tree algorithm. A dynamic version of fusion trees using hash tables was proposed in 1996, which matched the original structure's $O(\log_w n)$ runtime in expectation. Another dynamic version, using exponential trees, was proposed in 2007 and yields worst-case runtimes of $O(\log_w n + \log \log n)$ per operation. It remains open whether dynamic fusion trees can achieve $O(\log_w n)$ per operation with high probability.
A fusion tree is essentially a B-tree with branching factor $w^{1/5}$ (any small exponent is also possible), which gives it a height of $O(\log_w n)$. To achieve the desired runtimes for updates and queries, the fusion tree must be able to search a node containing up to $w^{1/5}$ keys in constant time. This is done by compressing ('sketching') the keys so that all of them fit into one machine word, which in turn allows the comparisons to be done in parallel.

Sketching is the method by which each $w$-bit key at a node containing $k$ keys is compressed into only $k - 1$ bits. Each key $x$ may be thought of as a path in the full binary tree of height $w$, starting at the root and ending at the leaf corresponding to $x$. To distinguish two paths, it suffices to look at their branching point (the first bit where the two keys differ). All $k$ paths together have $k - 1$ branching points, so at most $k - 1$ bits are needed to distinguish any two of the $k$ keys. An important property of the sketch function is that it preserves the order of the keys. That is, sketch($x$) < sketch($y$) for any two keys $x < y$. If the locations of the sketch bits are $b_1 < b_2 < \cdots < b_r$, then the sketch of the key $x_{w-1} \cdots x_1 x_0$ is the $r$-bit integer $x_{b_r} x_{b_{r-1}} \cdots x_{b_1}$.

With only standard word operations, such as those of the C programming language, it is difficult to directly compute the sketch of a key in constant time. Instead, the sketch bits can be packed into a range of size at most $r^4$, using bitwise AND and multiplication. The bitwise AND operation serves to clear all non-sketch bits from the key, while the multiplication shifts the sketch bits into a small range. Like the 'perfect' sketch, the approximate sketch preserves the order of the keys. Some preprocessing is needed to determine the correct multiplication constant. Each sketch bit in location $b_i$ will get shifted to $b_i + m_i$ via a multiplication by $m = \sum_{i=1}^{r} 2^{m_i}$.
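Before turning to the conditions on the shifts, the 'perfect' sketch described above can be illustrated with a minimal Python example. The function names `branching_bits` and `perfect_sketch` and the four 8-bit keys are assumptions made for illustration only. The loop extracts one bit at a time, so it takes time proportional to $r$ rather than constant time; it is meant to show what the sketch is, not how a fusion tree computes it.

```python
def branching_bits(keys):
    """Bit positions where the paths of the (distinct) keys branch, ascending."""
    ordered = sorted(keys)
    positions = set()
    for a, b in zip(ordered, ordered[1:]):
        # The highest bit in which two sorted neighbours differ is the
        # branching point of their root-to-leaf paths.
        positions.add((a ^ b).bit_length() - 1)
    return sorted(positions)


def perfect_sketch(x, bits):
    """Concatenate the bits of x at positions b_r, ..., b_1 (b_r most significant)."""
    s = 0
    for b in reversed(bits):
        s = (s << 1) | ((x >> b) & 1)
    return s


# Hypothetical example: k = 4 keys with w = 8.
keys = [0b00010010, 0b00010110, 0b10010010, 0b11010000]
bits = branching_bits(keys)                          # [2, 6, 7]
sketches = [perfect_sketch(x, bits) for x in keys]   # [0, 1, 4, 6]

# The sketches are distinct and appear in the same order as the keys.
assert sketches == sorted(sketches) and len(set(sketches)) == len(keys)
```

Here the four keys branch at only three bit positions, so each 8-bit key is reduced to a 3-bit sketch without changing the relative order.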
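For the approximate sketch to work, the following three properties must hold:

(1) The sums $b_i + m_j$ are distinct for all pairs $(i, j)$, so that the sketch bits are not corrupted by the multiplication.
(2) $b_i + m_i$ is a strictly increasing function of $i$, so that the order of the sketch bits is preserved.
(3) $(b_r + m_r) - (b_1 + m_1) \le r^4$, so that the sketch bits are packed into a range of size at most $r^4$.

An inductive argument shows how the $m_i$ can be constructed. Let $m_1 = w - b_1$. Suppose that $1 < t \le r$ and that $m_1, m_2, \ldots, m_{t-1}$ have already been chosen. Then pick the smallest integer $m_t$ such that both properties (1) and (2) are satisfied. Property (1) requires that $m_t \ne b_i - b_j + m_l$ for all $1 \le i, j \le r$ and $1 \le l \le t - 1$. Thus, there are fewer than $t r^2 \le r^3$ values that $m_t$ must avoid. Since $m_t$ is chosen to be minimal, $(b_t + m_t) \le (b_{t-1} + m_{t-1}) + r^3$. Summing this bound over $t = 2, \ldots, r$ gives $(b_r + m_r) - (b_1 + m_1) \le (r - 1) r^3 \le r^4$, which is Property (3).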
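The inductive construction above can be tried out in a short Python example. This is a sketch under stated assumptions, not the original authors' implementation: the names `choose_shifts` and `approx_sketch`, the bit positions `[2, 6, 7]`, and the sample keys are all hypothetical. It also applies a second bitwise AND after the multiplication to discard the cross terms $b_i + m_j$ with $i \ne j$, and it relies on Python's arbitrary-precision integers, whereas on a real machine the product would span more than one word.

```python
def choose_shifts(bits, w):
    """Greedily pick m_1..m_r for sketch bit positions b_1 < ... < b_r."""
    shifts = [w - bits[0]]  # m_1 = w - b_1
    for t in range(1, len(bits)):
        # Property (1): m_t must differ from b_i - b_j + m_l for every chosen m_l.
        forbidden = {bi - bj + ml for bi in bits for bj in bits for ml in shifts}
        # Property (2): b_t + m_t must exceed b_{t-1} + m_{t-1}.
        m = bits[t - 1] + shifts[-1] - bits[t] + 1
        while m in forbidden:
            m += 1
        shifts.append(m)
    return shifts


def approx_sketch(x, bits, shifts):
    """Mask the sketch bits, multiply by m, keep the bits at b_i + m_i."""
    in_mask = sum(1 << b for b in bits)
    m = sum(1 << s for s in shifts)
    out_mask = sum(1 << (b + s) for b, s in zip(bits, shifts))
    product = (x & in_mask) * m
    # Shift down so the lowest sketch bit (b_1 + m_1) lands at position 0.
    return (product & out_mask) >> (bits[0] + shifts[0])


# Hypothetical example: w = 8, sketch bits at positions 2, 6 and 7.
w, bits = 8, [2, 6, 7]
shifts = choose_shifts(bits, w)                       # [6, 3, 9]
keys = [0b00010010, 0b00010110, 0b10010010, 0b11010000]
sketches = [approx_sketch(x, bits, shifts) for x in keys]

# Property (3): the shifted sketch bits span a range of at most r^4.
span = (bits[-1] + shifts[-1]) - (bits[0] + shifts[0])
assert span <= len(bits) ** 4
# Like the perfect sketch, the approximate sketch preserves key order.
assert sketches == sorted(sketches) and len(set(sketches)) == len(keys)
```

In this example the sketch bits end up at positions 8, 9 and 16 of the product, so the approximate sketch is sparser than the 3-bit perfect sketch but still orders the keys the same way.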
