language-icon Old Web
English
Sign In

B-tree

In computer science, a B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree is a generalization of a binary search tree in that a node can have more than two children. Unlike other self-balancing binary search trees, the B-tree is well suited for storage systems that read and write relatively large blocks of data, such as discs. It is commonly used in databases and file systems. In computer science, a B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree is a generalization of a binary search tree in that a node can have more than two children. Unlike other self-balancing binary search trees, the B-tree is well suited for storage systems that read and write relatively large blocks of data, such as discs. It is commonly used in databases and file systems. What, if anything, the B stands for has never been established. In B-trees, internal (non-leaf) nodes can have a variable number of child nodes within some pre-defined range. When data is inserted or removed from a node, its number of child nodes changes. In order to maintain the pre-defined range, internal nodes may be joined or split. Because a range of child nodes is permitted, B-trees do not need re-balancing as frequently as other self-balancing search trees, but may waste some space, since nodes are not entirely full. The lower and upper bounds on the number of child nodes are typically fixed for a particular implementation. For example, in a 2-3 B-tree (often simply referred to as a 2-3 tree), each internal node may have only 2 or 3 child nodes. Each internal node of a B-tree contains a number of keys. The keys act as separation values which divide its subtrees. For example, if an internal node has 3 child nodes (or subtrees) then it must have 2 keys: a1 and a2. All values in the leftmost subtree will be less than a1, all values in the middle subtree will be between a1 and a2, and all values in the rightmost subtree will be greater than a2. Usually, the number of keys is chosen to vary between d {displaystyle d} and 2 d {displaystyle 2d} , where d {displaystyle d} is the minimum number of keys, and d + 1 {displaystyle d+1} is the minimum degree or branching factor of the tree. In practice, the keys take up the most space in a node. The factor of 2 will guarantee that nodes can be split or combined. If an internal node has 2 d {displaystyle 2d} keys, then adding a key to that node can be accomplished by splitting the hypothetical 2 d + 1 {displaystyle 2d+1} key node into two d {displaystyle d} key nodes and moving the key that would have been in the middle to the parent node. Each split node has the required minimum number of keys. Similarly, if an internal node and its neighbor each have d {displaystyle d} keys, then a key may be deleted from the internal node by combining it with its neighbor. Deleting the key would make the internal node have d − 1 {displaystyle d-1} keys; joining the neighbor would add d {displaystyle d} keys plus one more key brought down from the neighbor's parent. The result is an entirely full node of 2 d {displaystyle 2d} keys. The number of branches (or child nodes) from a node will be one more than the number of keys stored in the node. In a 2-3 B-tree, the internal nodes will store either one key (with two child nodes) or two keys (with three child nodes). A B-tree is sometimes described with the parameters ( d + 1 ) {displaystyle (d+1)} — ( 2 d + 1 ) {displaystyle (2d+1)} or simply with the highest branching order, ( 2 d + 1 ) {displaystyle (2d+1)} . A B-tree is kept balanced after insertion by splitting a would-be overfilled node, of 2 d + 1 {displaystyle 2d+1} keys, into two d {displaystyle d} -key siblings and inserting the mid-value key into the parent. Depth only increases when the root is split, maintaining balance. Similarly, a B-tree is kept balanced after deletion by merging or redistibuting keys among siblings to maintain the d {displaystyle d} -key minimum for non-root nodes. A merger reduces the number of keys in the parent potentially forcing it to merge or redistribute keys with its siblings, and so on. The only change in depth occurs when the root has two children, of d {displaystyle d} and (transitionally) d − 1 {displaystyle d-1} keys, in which case the two siblings and parent are merged, reducing the depth by one.

[ "Algorithm", "Database", "Data mining", "Programming language", "Tree (data structure)" ]
Parent Topic
Child Topic
    No Parent Topic