📝

CS224 Advanced Algorithms - Lecture Notes

Jul 21, 2024

CS224 Advanced Algorithms - Lecture Notes

Course Information

  • Instructor: Gelani Nelson
  • Teaching Fellow: Jeffrey
  • Contact: Email cs224 df14 D staff at C's. harvard.edu

Administrative Details

  • Website: Google the course or Gelani Nelson's name, linked from his website.
  • Mailing List: Sign up on the course website.
  • Yellow Sheet: Fill it out as it goes around.

Grading Components

  1. Scribing (10%)
    • Take lecture notes using an lwtech template
    • The lectures are recorded for review
    • Notes must be submitted by 9:00 AM the next day
  2. Problem Sets (PSETs, 60%)
    • Submit via email
    • Must be law-teched
    • Page limits apply
  3. Final Project (30%)
    • Proposal due October 30th
    • Project due last day of the reading period
    • Details on the website

Other Logistics

  • Students may need to serve as graders once
  • Required to scribe at least once, possibly twice if the class size decreases
  • Scribe duties and grading are first-come, first-served

Course Goals

  • Analyze and create algorithms
  • Understand and use advanced techniques for algorithm analysis
  • Theory-focused
  • No programming assignments, except for the optional implementation final project

Advanced Algorithm Topics

Comparing with CS124

  • More theory-focused
  • Analyzing new models for efficiency
  • Different measures of efficiency beyond running time and memory

Models of Analysis

  • Running Time: Typically looked at in terms of steps of the algorithm
  • Memory Usage: Minimizing memory
  • Other Parameters: Introducing other models for efficiency

Example: Sorting and Predecessor Problem

Sorting Efficiency

  • Traditional claim: Cannot sort n numbers faster than n log n
  • This lecture: Sorting can be faster than n log n under certain conditions

Static Predecessor Problem

  • Goal: Represent set S = [X1, X2, ..., Xn] and support predecessor queries
  • Predecessor Query: Find max element < Z in set S
  • Space and Speed: Aim for low space, fast query
  • Static vs Dynamic: Static does not change, dynamic allows insertions and deletions

Naive Solutions

  1. Binary Search Tree (BST) Method
    • Static: Sorted numbers with binary search
    • Query Time: O(log n)
    • Dynamic: Use balanced BST like Red-Black Tree
    • Query and Update Time: O(log n)
    • Inefficiency: Linear or higher space complexity

More Efficient Solutions

Van Emde Boas Trees (vEB Trees)

  • Creators: van Emde Boas et al.
  • Time Complexity: Update and query in O(log w) time
  • Space Complexity: Initial U, can be made Θ(n) with randomization
  • Structure:
    • Represent universe size U
    • Root dimension splits into √U clusters and summary
    • Min/Max elements tracked separately

Algorithm Details

  1. Predecessor query
    • Recursively check clusters or summary for predecessors
    • Time: O(log log U)
  2. Insertion
    • Insert by updating Min, and recursively inserting adjusted values
    • Time: O(log log U)

Improvements in Space

  1. Linear Space with Hashing

    • Only store non-empty clusters in hash tables
    • Space Time Complexity: Θ(n)
  2. Use of Dictionaries

    • Store key-value pairs for clusters
    • Dictionary structures allow constant time access and expected constant time insertion, deletion

Fusion Trees (Fredman and Willard)

  • Query Time: log_w n
  • Space: Linear
  • Technique: Bitwise operations and more complex data structures
  • Comparative Operations: Useful in word RAM model (integers, bitwise exor, bit shifting)

Further Reading

  • Han and Thorup's works on sorting algorithms faster than n log n
  • Research on dictionary problems and hashing techniques for better space efficiency

Notes on Lecture Logistics

  • Scribes need to include references to works mentioned, such as vEB trees and Fusion Trees
  • Review of all given data structures and their implication for advanced sorting and searching algorithms

Conclusion

  • Upcoming lecture will cover Fusion Trees in greater detail
  • Matching bounds and further optimization techniques will be discussed