Introduction to Disjoint Set Data Structure
Importance
- Commonly asked in interviews & online assessments
- Mastery required by watching this video and solving follow-up problems
Why Use Disjoint Set Data Structure?
- To determine if two nodes belong to the same component in constant time
- Useful in dynamic graphs, which change their configuration frequently
Example
- Given graph components: [1, 2, 3, 4] and [5, 6, 7]
- Traditional brute force using DFS/BFS takes linear time (O(N+E))
- Disjoint Set Data Structure answers in constant time
Key Operations
- Find Parent: Determine the parent node of a given node
- Union: Connect two components
- Methods: By rank or by size
Implementation of Disjoint Set Data Structure
Initial Configuration
- All nodes are their own parents
- Rank/Size initialized (rank to 0 and size to 1)
Union by Rank
- Compare ranks of parent nodes
- Attach smaller rank tree under root of higher rank tree
- Equal rank -> attach any and increment rank
Union by Size
- Similar to rank, but using the size of components
- Attach smaller size tree to the root of the larger size tree
- Equal size -> attach any and increment size
Path Compression
- Used in Find Parent operation
- Flattens the structure by making nodes point directly to the root
- Reduces time complexity to nearly constant
Coding Implementation
Class Structure
- Implemented in a reusable class format
- Constructor initializes parent and rank/size arrays
Pseudocode
- Find Parent with Path Compression:
if node == parent[node]:
return node
else:
parent[node] = find_parent(parent[node])
return parent[node]
if rank[pu] < rank[pv]:
parent[pu] = pv
elif rank[pu] > rank[pv]:
parent[pv] = pu
else:
parent[pv] = pu
rank[pu] += 1
if size[pu] < size[pv]:
parent[pu] = pv
size[pv] += size[pu]
else:
parent[pv] = pu
size[pu] += size[pv]
Time Complexity
- Both Union operations (by Rank and by Size) achieve nearly constant time complexity, O(α(N)), where α(N) is the inverse Ackermann function, which grows very slowly and is almost constant.