B Trees and B+ Trees Data structures in Computer Sciences

171 views 39 slides Nov 05, 2024
Slide 1
Slide 1 of 39
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39

About This Presentation

B-Trees and B+ Trees are some of the data structures that are used to store data on disk storage. They allow for faster and efficient operations compared to BST and Binary Trees. In this Presentation we will discuss about operations, time complexity and some examples and exercises.


Slide Content

B TreeandB+ Tree
PresentedBy: Tanmay Kataria
(2310994841)
SIT 221 – Data Structures and Algorithms
Deakin University Chitkara University

Try to Remember Binary Tree…
Binary Tree
Binary Tree can only have two child’s at most
Each child can have only 1 parent.
Lets try to solve the problem below:
Consider this binary search tree, and imagine if
this tree grows to a million records.
Try to figure out height and time
complexity of the tree

B Tree
To coverup the limitations of Binary tree we use B Tree.
It is a data structure that can handle massive amounts of data easily.
Each node in a B-Tree can contain multiple keys.
It allows the tree to have a larger branching factor and a shallower height.
B-Trees are particularly well suited for storage systems that have slow, bulky data access
such as hard drives, flash memory, flash drives and CD-ROMs.

Moving Back…
Going back to the example for
Binary Search tree.
We can make the new tree with only 4
childs and 3 keys.
Using B Tree we can reduce the height of
tree easily.

Properties of B Tree
All leaves are at the same level.
B-Tree is defined by the term minimum degree ‘t‘. The value of ‘t‘ depends upon disk
block size.
Every node except the root must contain at least t-1 keys. The root may contain a
minimum of 1 key.
All nodes (including root) may contain at most (2*t – 1) keys.
Number of children of a node is equal to the number of keys in it plus 1.

More Properties...
All keys of a node are sorted in increasing order. The child between two keys k1 and k2
contains all keys in the range from k1 and k2.
B-Tree grows and shrinks from the root which is unlike Binary Search Tree. Binary Search
Trees grow downward and also shrink from downward.

Always Remember!
While Creating a B Tree of order ‘k’( k should always be odd! ), remember:
All leaves are on the same level.
All non-leaf nodes can have at least ‘k/2’ children. (Except the root)
A leaf node can not contain more than ‘k-1’ keys.
The root should be either
A leaf node ,or
Should have ‘2’ to ‘k’ children
The number of keys in each non-leaf-node should be one less than the number of its
children and these keys partition the keys in the children

Searching In B Tree
Searching a B-Tree is similar to searching a binary tree.
The algorithm is similar and goes with recursion.
At each level, the search is optimized as if the key value is not present in the range of the
parent then the key is present in another branch.
As these values limit the search they are also known as limiting values or separation values.
If we reach a leaf node and don’t find the desired key then it will display NULL.

Searching Operation
Or we can say,
Let the key to be searched is k.
Start from the root and recursively traverse down.
For every visited non-leaf node,
If the node has the key, we simply return the node.
Otherwise, we recur down to the appropriate child (The child which is just before the first
greater key) of the node.
If we reach a leaf node and don’t find k in the leaf node, then return NULL.

Applications of B-Trees
It is used in large databases to access data stored on the disk
Searching for data in a data set can be achieved in significantly less time using the B-Tree
With the indexing feature, multilevel indexing can be achieved.

More Applications
Most of the servers also use the B-tree approach.
B-Trees are used in CAD systems to organize and search geometric data.
B-Trees are also used in other areas such as natural language processing, computer
networks, and cryptography.

Advantages of B-Trees
 B-Trees have a guaranteed time complexity of O(log n) for basic operations like insertion,
deletion, and searching, which makes them suitable for large data sets and real-time
applications.
 B-Trees are self-balancing.
High-concurrency and high-throughput.
Efficient storage utilization.

Disadvantages
B-Trees are based on disk-based data structures and can have a high disk usage.
Not the best for all cases.
Slow in comparison to other data structures.

Time Complexity
Operation Time Complexity
Searching O(log n)
Insertion O(log n)
Deletion O(log n)

Insertion
A new key is always inserted at leaf node.
Like BST, we start from the top and traverse to leaf nodes.
Let the key to be inserted be k.
If required, Traverse down to the required node.
Before inserting we check if the there is space for the key to be inserted.
Unlike BST, here we have a predefined range on number of keys a node can contain.
If there is space then add the key.
Otherwise, we split it to create space for the key while maintaining the properties of the B
Tree. (the key used to create a more space, could be called as partitioning key)

Exercise – Creating a B Tree
 with Key = { 1 , 12 , 8 , 2 , 25 , 6 , 14 , 28 , 17 , 7 , 52 , 16 , 48 , 68 } , and
Order of B Tree = 5
Step 1: Add 1 , 12, 8 , 2 , 25 , and mark 8 as the portioning key.
Step 2: Add 6 , 14 , 28 , 17 , 7, as the children according to the range. And mark 17 as next
partitioning key.

Step 3: Continue adding 5 values and repeating the above steps,
until you get this tree
Note: Remember to perform sorting every time each element is added and after sorting
the partitioning key is selected.

Deletion
Find the key from the tree, it can be anywhere and could be deleted from anywhere.
Start from the top and search downwards until you find the key.
If the key is in a leaf (a node with no children), just remove it.
If the key is in an internal node then replace it with another key (called a replacement
key) that keeps the B-tree’s order intact.

Replacement key
Find a Replacement Key:
•If the child on the left has enough keys, use the largest key from the left child to replace
the deleted key.
•If the left child doesn’t have enough keys, then try the smallest key from the right child.
•If neither side has enough keys, then
combine both children and delete the key.

Deletion…
If the Key is Not in the Current Node:
•Check which child subtree (left or right) would hold the key.
•Before moving down, make sure the child has enough keys to maintain the B-tree
structure.
•If it doesn’t, borrow a key from an adjacent sibling with enough keys or merge with a
sibling if possible.

Exercise
Try to insert 10,26 to the previous tree
Answer:
Try to delete the key 14 from the previous question.
Answer:

What have we learnt till now…
Why we need B Trees
What are B Trees
Properties of B Trees
Advantages and Disadvantages of B Trees
Searching in B Trees
Insertion, Deletion In B Trees

B+ Tree
Common implementation of B Trees

B+ Tree
Like in B Tree, order of B+ Tree determines its height.
It contains data pointers are only at the leaf nodes of the tree.
B+ Tree is data structure often used when implementing database indexes.
It contains index pages and data pages.
The data pages are always at leaf nodes.
All the root nodes and intermediate nodes are index pages.
A leaf node may store more or less data records than an internal node stores keys.

Features of B+ Trees
Trees have a high fan-out, which means that each node can have many child nodes.
B+ Trees are designed to be cache-friendly, which means that they can take advantage of the
caching mechanisms in modern computer architectures to improve performance.
B+ Trees are often used for disk-based storage systems because they are efficient at storing and
retrieving data from disk.
B+ Trees are self-balancing, which means that as data is added or removed from the tree, it
automatically adjusts itself to maintain a balanced structure.
B+Trees maintain the order of the keys in the tree,

Searching
Searching works like in BST.
Let us suppose we have to find ‘k’ in the B+ Tree.
We will start by fetching from the root node then we will move to the leaf node, which might
contain a record of ‘k’.
We will traverse the tree until we get ‘k’.
If we are unable to find that node, we will return that ‘record not founded’ message.

Insertion
•Every element in the tree has to be inserted into a leaf node. Therefore, it is necessary to go to a
proper leaf node.
•Insert the key into the leaf node in increasing order if there is no overflow.

Insertion Steps
If the tree is empty, add the value to the root.
For every value to be added in a tree, a search is performed to know the location of the
value to be added.
Once the root is full, split the data in 2 leaves, using the root to hold keys and pointers.
If adding an element will split exceed the leaves capacity, then take the median and split
the leaf.

Insertion Exercise
Insert 70 in the tree.
Answer:

Insertion Examples…
Insert 95 in the tree.
Answer

Deletion
Deletion in B+ Trees is just not deletion but it is a combined process of Searching, Deletion, and
Balancing.
In the last step of the Deletion Process, it is mandatory to balance the B+ Trees, otherwise, it fails
in the property of B+ Trees.

Deletion Examples
Delete 70 from the tree
Answer

Deletion Examples…
Delete 25 from the tree
Answer

Advantages of B+ Trees
•A B+ tree with ‘l’ levels can store more entries in its internal nodes compared to a B-tree having
the same ‘l’ levels. This accentuates the significant improvement made to the search time for any
given key. Having lesser levels and the presence of P(next )pointers imply that the B+ trees is
very quick and efficient in accessing records from disks.
•Data stored in a B+ tree can be accessed both sequentially and directly.
•It takes an equal number of disk accesses to fetch records.
•B+ Trees have redundant search keys, and storing search keys repeatedly is not possible.

Disadvantage
•The major drawback of B-tree is the difficulty of traversing the keys sequentially. The B+ tree
retains the rapid random access property of the B-tree while also allowing rapid sequential
access.

Application
•Multilevel Indexing
•Faster operations on the tree (insertion, deletion, search)
•Database indexing

Time Complexity
Operation Time Complexity
Searching O(log
d n)
Insertion O(log
d n)
Deletion O(log
d n)
•In a B+ Tree, searching for a key requires traversing from the root to
a leaf.
•With branching factor d (each node can have up to d children)
and the height of the tree becomes approximately log
d N, where
N is the number of elements.
•Within each node, a binary search on the keys has a time
complexity of O(log
d), but since d is fixed, the complexity is
effectively O(log
d N).

This presentation was made to explain B Trees and B+ Tree to the
students as Part of the Helping others for SIT 221
SIT 221 – Data Structures and Algorithms
Deakin University Chitkara University
By: Tanmay Kataria
(2310994841)

References
https://www.geeksforgeeks.org/introduction-of-b-tree
https://www.javatpoint.com/b-tree