Coconote
AI notes
AI voice & video notes
Export note
Try for free
Hashing Lecture Notes
Jul 16, 2024
Hashing Lecture Notes
Introduction
Topic
: Hashing
Previous lecture
: Set data structures (storing items by key, unsorted array, looking up in log n time).
Today
: Discussing the set interface (dynamic find, insert, delete) and hashing for faster operations.
Static Find
Aim: Can we do faster than log n?
Introduced new model:
Comparison Model
(only comparisons between keys are allowed).
Comparison Sort Algorithms
:
Based on key comparisons (equal, less than, greater than).
Represented using a
Decision Tree
, where internal nodes are comparisons and leaves are outputs.
Goal
: Minimum height of binary tree with n+1 leaves (output n items or not found) is log(n).
Direct Access Arrays
Store item in an array at the index of its key.
Performance
: Constant time for find, insert, delete.
Problem
: Prohibitively large space requirement if key range is large (e.g., MIT IDs).
To overcome
: Use more space-efficient structures.
Hashing
Concept
: Map a large key space to a smaller array using a hash function.
Function
: h(key) maps key to an array index (0 to m-1).
Problem
: Collisions (multiple keys mapping to the same index).
Solutions
:
Open Addressing
: Find another place in the array (complicated).
Chaining
: Store a list or other data structure at each array index that manages collisions.
Choosing a Hash Function
Division Method
:
h(k) = k mod m
.
Works well if keys are uniformly distributed.
Universal Hash Function
: More generalized approach to ensure better performance.
Function
:
h_a,b(k) = ((a * k + b) mod p) mod m
(with random a and b).
Ensures key distribution with less probability of clustering.
Universality
: Ensures probability of two keys colliding is ≤ 1/m.
Implication
: Expected chain length is constant (1 + (n-1)/m) if m ≈ n.
Making Hash Tables Dynamic
As the number of stored keys (n) grows, increase m dynamically.
Rebuild
: Recreate hash table with larger m to maintain efficiency, similar to dynamic arrays.
Conclusion
Hashing allows efficient dynamic set operations even when keys are large integers.
Next lecture will focus on faster sorting techniques.
📄
Full transcript