Links to previous explanations of multilevel indexing and different types of indexes in the description box.
Key Concepts
Index Types
Primary Index: Directly maps to records.
Clustered Index: Data is physically organized according to the index.
Secondary Index: Additional indexes that reference primary keys.
Dense Index: Contains an entry for every key in the data record.
Sparse Index: Contains an entry for only some of the keys.
Key Value and Pointer
Key Value: The unique identifier for searching (e.g., roll number).
Pointer: Reference to the actual data location in secondary memory.
Example: In a book, the key value is the topic name, and the pointer is the page number.
Multilevel Indexing
In larger datasets, multiple levels of indexes may be created.
Inserting or deleting records requires updates in all relevant indexes, complicating management.
B-Tree Structure
Tree Basics
Nodes and Edges: Nodes are connected by edges; no cycles exist (unlike graphs).
Types of Nodes:
Root Node: The top node of the tree.
Intermediate Nodes: Nodes between root and leaves.
Leaf Nodes: The bottom nodes with no children.
Balanced Tree
A B-tree is a balanced tree, where all leaf nodes are at the same level.
Unbalanced trees have leaf nodes at different levels.
Nomenclature in B-Tree
Pointers
Block Pointer: Used to reference child nodes.
Record Pointer: Points to actual data in secondary memory.
Keys
Multiple keys can be stored in a single node.
Each key corresponds to a record pointer, which points to detailed data.
Order of B-Tree
Order (P): Maximum number of children a node can have.
Node Requirements:
Root Node: Can have P children.
Intermediate/Leaf Nodes: Can have between ⌈P/2⌉ and P children.
Example of Order 4 B-Tree
Maximum Children: 4 (for root and intermediate nodes).
Maximum Keys: P-1 (3 for order 4).
Record Pointers: Equal to keys (also 3).
Insertion and Search
Insert keys in sorted order to maintain binary search properties.
Record pointers refer to data in secondary memory based on key values.
Conclusion
B-trees are effective for multilevel indexing, balancing the need for organized access to large datasets while enabling efficient searches and updates.