B Trees
Eduardo Laber
David Sotelo
What are B-trees?
• Balanced search trees designed for secondary
storage devices
Examples (keys 1 to 8):
    root 4, leaves 1 2 3 | 5 6 7 8
    root 5, leaves 1 2 3 4 | 6 7 8
    root 3 6, leaves 1 2 | 4 5 | 7 8
The height of a B-tree
Theorem: Let h be the height of a B-tree of n keys and order B > 1. Then $h \le \log_B \frac{n+1}{2}$.
Proof:
• The root contains at least one key.
• All other nodes contain at least B keys.
• At least one key at depth 0.
• At least $2B$ keys at depth 1.
• At least $2B^2 + B$ keys at depth 2.
• At least $2B^i + B^{i-1} + B^{i-2} + \dots + B$ keys at depth i.
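Proof (continued)
Keeping only the leading term $2B^i$ at each depth i ≥ 1 and summing over all depths:
$$n \ge 1 + 2\sum_{i=1}^{h} B^i = 1 + 2\,\frac{B^{h+1} - B}{B - 1} \ge 2B^h - 1$$
$$B^h \le \frac{n+1}{2} \;\Rightarrow\; h \le \log_B \frac{n+1}{2}$$ ■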
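As a sanity check of the height bound, the following minimal sketch counts the fewest keys a B-tree of a given height can hold and compares the height with $\log_B \frac{n+1}{2}$. The function names `min_keys` and `height_bound` are illustrative and do not come from the slides; the count assumes that every non-root node holds at least B keys and that every internal non-root node therefore has at least B + 1 children.

```python
import math


def min_keys(height: int, B: int) -> int:
    """Fewest keys in a B-tree of the given height (hypothetical helper)."""
    keys = 1   # the root holds at least one key
    nodes = 2  # so there are at least two nodes at depth 1
    for _ in range(height):
        keys += nodes * B   # each node at this depth holds at least B keys
        nodes *= B + 1      # and has at least B + 1 children
    return keys


def height_bound(n: int, B: int) -> float:
    """Right-hand side of the theorem: log_B((n + 1) / 2)."""
    return math.log((n + 1) / 2, B)


for h in range(5):
    n = min_keys(h, 2)
    print(h, n, round(height_bound(n, 2), 3))
```

Because the sparsest tree of height h already satisfies the inequality, any tree with more keys at the same height satisfies it as well.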
Searching a B-tree
• Similar to searching a binary search tree.
    root:   C G K P S
    leaves: A B | D E F | H I | L M N O | Q R | T U
• Search for the key F: at the root, C < F < G, so the search follows the pointer between C and G to the leaf D E F and finds F there.
• Search for the key N: at the root, K < N < P, so the search follows the pointer between K and P to the leaf L M N O and finds N there.
Lemma: The time complexity of procedure B-TREE-SEARCH is $O(B \log_B n)$.
Proof:
• The number of recursive calls is at most the tree’s height plus one.
• The height of a B-tree is $O(\log_B n)$.
• Each call scans the keys of one node linearly, which costs between B and 2B iterations for a non-root node.
• Total of $O(B \log_B n)$ steps. ■
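The search described by the lemma can be sketched as follows. This is a minimal in-memory illustration, not the B-TREE-SEARCH procedure of the slides: the `Node` class and the name `btree_search` are assumptions made for the example, and disk reads are omitted. The example tree is the one used in the search figures.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)  # empty for a leaf


def btree_search(node, key):
    """Return (node, index) where key is stored, or None if it is absent."""
    i = 0
    # Linear scan inside the node: at most 2B iterations.
    while i < len(node.keys) and key > node.keys[i]:
        i += 1
    if i < len(node.keys) and node.keys[i] == key:
        return node, i
    if not node.children:  # reached a leaf without finding the key
        return None
    return btree_search(node.children[i], key)  # one call per level


root = Node(list("CGKPS"), [Node(list(s)) for s in ("AB", "DEF", "HI", "LMNO", "QR", "TU")])
hit = btree_search(root, "F")
print(hit[0].keys, hit[1])
```

Each recursive call descends one level, so the number of calls is bounded by the height, and the `while` loop accounts for the factor B in the lemma.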
Exercise 2
• Suppose that B-TREE-SEARCH is implemented
to use binary search rather than linear search
within each node.
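Proof (sketch): binary search examines $O(\lg B)$ keys in a node holding at most 2B keys, and the search visits $O(\log_B n)$ nodes. Since $\log_B n = \lg n / \lg B$, the total number of comparisons is $O(\lg B \cdot \log_B n) = O(\lg n)$, independent of B.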
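For the binary-search variant described in this exercise, the scan inside one node can be sketched with Python's standard `bisect` module. The helper name `search_node` is an assumption made for illustration; it returns the position at which the search would either find the key or descend to a child.

```python
from bisect import bisect_left


def search_node(keys, key):
    """Binary search in one node's sorted key list.

    Returns (i, found): if found, keys[i] == key; otherwise i is the index
    of the child pointer to follow.
    """
    i = bisect_left(keys, key)  # O(lg B) comparisons instead of O(B)
    found = i < len(keys) and keys[i] == key
    return i, found


print(search_node(list("CGKPS"), "F"))
print(search_node(list("DEF"), "F"))
```

Only the per-node cost changes; the number of nodes visited is still bounded by the height of the tree.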
Inserting a key into a B-tree
• The new key is always inserted into an existing leaf node (why?)
• First we search for the leaf position at which to insert the new key.
• A split operation splits an overfull node (2B + 1 keys) around its median key into two nodes having B keys each; the median key moves up into the parent.
Split operation (example with B = 2, inserting F)
• Node found but already full: the leaf A C E G already holds 2B keys.
• Median key identified: E, the middle key of A C E F G.
• Splitting the node: E moves up into the parent.
        E J
        A C | F G | K M O Q
• Insertion can be propagated upward (B = 2). Example: inserting N.
    Before:
        root:   E J T X
        leaves: A C | F G | K M O Q | U W | Y Z
    N goes into the leaf K M O Q, which now holds K M N O Q (more than 2B keys).
    SPLIT of the leaf: N moves up and the root overflows.
        root:   E J N T X
        leaves: A C | F G | K M | O Q | U W | Y Z
    SPLIT of the root: N becomes the new root.
        root:     N
        internal: E J | T X
        leaves:   A C | F G | K M | O Q | U W | Y Z
B-TREE-INSERT (x, k, y)
    // Find the position i at which k belongs in x.
    i = 1
    while i ≤ x.n and k > x.key_i do i = i + 1
    // Shift the larger keys and their right pointers one slot to the right.
    x.n = x.n + 1
    for j = x.n downto i+1 do
        x.key_j = x.key_{j-1}
        x.p_{j+1} = x.p_j
    end-for
    // Store k and the pointer y to its right.
    x.key_i = k
    x.p_{i+1} = y
    DISK-WRITE (x)
    // Overflow: split x and insert the median into the parent.
    if x.n > 2*B then
        [m, z] = SPLIT (x)
        if x.parent != NIL then
            DISK-READ (x.parent)
        else
            x.parent = ALLOCATE-NODE()
            x.parent.p_1 = x
            DISK-WRITE (x)
            root = x.parent
        end-if
        z.parent = x.parent
        B-TREE-INSERT (x.parent, m, z)
    end-if
SPLIT (x)
    z = ALLOCATE-NODE()
    m = FIND-MEDIAN (x)
    COPY-GREATER-ELEMENTS (x, m, z)    // elements greater than m go to the new node z
    DISK-WRITE (z)
    COPY-SMALLER-ELEMENTS (x, m, x)    // x keeps only the elements smaller than m
    DISK-WRITE (x)
    return [m, z]
• Function B-TREE-INSERT has three arguments:
– The node x at which an element of key k should be inserted
– The key k to be inserted
– A pointer y to the right child of k, to be used as one of the pointers of x during the insertion process (NIL when x is a leaf).
• At most one node is visited per level/depth, and only visited nodes can be split. At most one new node is created per level, plus a new root when the root itself is split. The cost of each split is proportional to 2B.
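The insertion with upward split propagation can be sketched in memory as follows. This is an illustrative variant, not the slides' procedure: instead of parent pointers and DISK-READ/DISK-WRITE calls, a recursive helper returns the median key and the new right sibling to its caller. The names `Node`, `split`, `_insert` and `insert` are assumptions made for the example.

```python
from bisect import bisect_left

B = 2  # minimum keys per non-root node; a node may hold at most 2 * B keys


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = list(keys or [])
        self.children = list(children or [])  # empty for a leaf


def split(node):
    """Split an overfull node around its median; return (median, right sibling)."""
    mid = len(node.keys) // 2
    median = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right


def _insert(node, key):
    """Insert key below node; return (median, right) if node was split, else None."""
    i = bisect_left(node.keys, key)
    if not node.children:                 # leaf: place the key directly
        node.keys.insert(i, key)
    else:                                 # internal node: descend first
        promoted = _insert(node.children[i], key)
        if promoted is None:
            return None
        median, right = promoted          # child was split: absorb its median
        node.keys.insert(i, median)
        node.children.insert(i + 1, right)
    return split(node) if len(node.keys) > 2 * B else None


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    promoted = _insert(root, key)
    if promoted is None:
        return root
    median, right = promoted              # the root was split: grow one level
    return Node([median], [root, right])


root = Node("EJTX", [Node(s) for s in ("AC", "FG", "KMOQ", "UW", "YZ")])
root = insert(root, "N")
print(root.keys, [c.keys for c in root.children])
```

Run on the example tree with B = 2, inserting N splits the leaf K M O Q and then the root, which reproduces the propagation shown in the slides.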
B G I
A C D F H J
Exercise 4
• A key can be deleted from any node (not just a leaf), and deletion can affect both the parent and the children of that node (insertion only affects parents).
Procedure:
Just remove the key from the node.
Deleting a key from a B-tree
• Case 1: The key is in a leaf node with more
than B elements (B = 2)
    Example: delete key C. The leaf A C D holds 3 > B keys, so C is simply removed.
        Before: leaf A C D
        After:  leaf A D
    The keys E J T X above the leaves and the remaining leaf keys F G K M O Q U W Y Z are unchanged.
Case 2: The join procedure
• The key k1 to be deleted is in a leaf x with exactly B elements.
• Let y be a node that is an “adjacent brother” of x.
• Suppose that y has exactly B elements.
Procedure:
Remove the key k1.
Let k2 be the key that separates nodes x and y in their parent.
Join the nodes x and y and move the key k2 from the parent
to the new joined node, which then holds 2B elements.
If the parent of x is left with B-1 elements and also has an
“adjacent brother” with B elements, apply the join procedure
recursively for the parent of x (seen as x) and its adjacent
brother (seen as y).
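The join step can be sketched as a small in-memory function. It is a minimal illustration under assumptions: `Node` and `join` are hypothetical names, nodes are plain Python objects rather than disk pages, and the recursive repair of the parent is left out.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])  # empty for a leaf


def join(parent, i):
    """Merge parent.children[i] and parent.children[i + 1] around the
    separator parent.keys[i], which moves down into the merged node."""
    left, right = parent.children[i], parent.children[i + 1]
    separator = parent.keys.pop(i)
    left.keys += [separator] + right.keys
    left.children += right.children
    del parent.children[i + 1]
    return left


# Example with B = 2: delete Q from x = O Q, then join x with y = U W.
parent = Node("KTX", [Node("HI"), Node("OQ"), Node("UW"), Node("YZ")])
x = parent.children[1]
x.keys.remove("Q")          # x now holds B - 1 keys
join(parent, 1)             # separator T moves down
print(parent.keys, [c.keys for c in parent.children])
```

After the call the parent has lost one key, so a full implementation would check whether the parent itself now holds fewer than B keys and repeat the repair one level up.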
• Case 2: Delete key Q (B = 2)
    Before (node x = O Q, node y = U W, separator k2 = T; the parent lies below the key F, and the rest of the tree is elided as ...):
        parent: K T X
        leaves: H I | O Q | U W | Y Z
    Remove Q: x = O, which has B-1 keys.
    Join x and y with the separator T:
        parent: K X
        leaves: H I | O T U W | Y Z
Case 3: join and split
• The key k1 to be deleted is in a leaf x with exactly B elements.
• Let y be a node that is an “adjacent brother” of x.
• Suppose that y has more than B elements.
Procedure:
Remove the key k1.
Let k2 be the key that separates nodes x and y in their parent.
Join the nodes x and y and move the key k2 from the parent
to the new joined node z.
Find the median key m of z.
Determine the new nodes x and y by splitting z around m.
Insert m into the parent of x and y.
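The join-and-split procedure amounts to redistributing keys between two adjacent brothers through their separator. A minimal sketch, with the hypothetical names `Node` and `join_and_split` and with nodes held in memory, could look like this:

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])  # empty for a leaf


def join_and_split(parent, i):
    """Join parent.children[i] and parent.children[i + 1] around the separator
    parent.keys[i], then split the result around its median key."""
    left, right = parent.children[i], parent.children[i + 1]
    keys = left.keys + [parent.keys[i]] + right.keys   # the joined node z
    children = left.children + right.children
    mid = len(keys) // 2                                # position of the median m
    left.keys, right.keys = keys[:mid], keys[mid + 1:]
    left.children, right.children = children[:mid + 1], children[mid + 1:]
    parent.keys[i] = keys[mid]                          # m replaces k2 in the parent


# Example with B = 2: delete F from x = F G; its brother y = A C D has 3 keys.
parent = Node("EJ", [Node("ACD"), Node("FG"), Node("KM")])
parent.children[1].keys.remove("F")
join_and_split(parent, 0)
print(parent.keys, [c.keys for c in parent.children])
```

The parent keeps the same number of keys, so unlike the join of Case 2 this repair never propagates upward.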
• Case 3: Delete key F (B = 2)
    Before (node x = F G, node y = A C D, separator k2 = E):
        root:     N
        internal: E J | T X
        leaves:   A C D | F G | K M | O Q | U W | Y Z
    Remove F: x = G, which has B-1 keys, while y has more than B keys.
    Join y and x with the separator E: z = A C D E G, and the parent keeps J.
    Median key of z: D.
    Split z around D and insert D into the parent:
        root:     N
        internal: D J | T X
        leaves:   A C | E G | K M | O Q | U W | Y Z
Case 4: internal node
• The key k1 to be deleted is in a node x that is neither a leaf nor
the root.
Procedure:
Let k2 be the smallest key that is greater than k1.
Let y be the node containing k2, which is always a leaf.
Insert key k2 into x.
Remove the key k1 from x.
Now solve the problem of removing k2 from the leaf y,
using the cases previously considered.
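Finding k2 and moving it into x can be sketched as follows. The names `Node`, `successor_leaf` and `replace_with_successor` are assumptions made for the example; the sketch only performs the replacement and returns the leaf y, which may then need one of the leaf cases to restore its minimum of B keys.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])  # empty for a leaf


def successor_leaf(x, i):
    """Return the leaf holding the smallest key greater than x.keys[i]."""
    node = x.children[i + 1]      # subtree just to the right of the key
    while node.children:          # then keep following the leftmost pointer
        node = node.children[0]
    return node


def replace_with_successor(x, i):
    """Overwrite x.keys[i] (k1) with its successor k2, removing k2 from its leaf."""
    y = successor_leaf(x, i)
    x.keys[i] = y.keys.pop(0)
    return y                      # the caller must repair y if it is underfull


# Example with B = 2: delete T from the internal node x = T X.
x = Node("TX", [Node("OQ"), Node("UW"), Node("YZ")])
y = replace_with_successor(x, 0)
print(x.keys, y.keys)
```

In the example the leaf y is left with B-1 keys, which is exactly the situation handled by the join or join-and-split procedures.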
• Case 4: Delete key T (B = 2)
    Before (node x = T X):
        root:     N
        internal: D J | T X
        leaves:   A C | E G | K M | O Q | U W | Y Z
    The smallest key greater than T is k2 = U, in the leaf y = U W. U replaces T in x and is removed from y:
        root:     N
        internal: D J | U X
        leaves:   A C | E G | K M | O Q | W | Y Z
    Node y = W now has B-1 keys and its adjacent brother Y Z has exactly B keys, so they are joined with the separator X:
        root:     N
        internal: D J | U
        leaves:   A C | E G | K M | O Q | W X Y Z
    Node x = U now has B-1 keys and its adjacent brother D J has exactly B keys, so they are joined with the separator N and the tree loses one level:
        root:   D J N U
        leaves: A C | E G | K M | O Q | W X Y Z