Overview of Data Structures and Algorithms

Jul 31, 2024

Lecture Notes on Data Structures and Algorithms

Introduction to Data Structures

  • Definition: A data structure is a way of organizing data so it can be used efficiently.
  • Importance:
    • Essential for creating fast algorithms.
    • Helps manage and organize data naturally.
    • Makes code cleaner and easier to understand.

Abstract Data Types

  • Abstract Data Type (ADT): An abstraction that provides an interface for a data structure without specifying implementation details.
  • Example: Modes of transportation (walking, biking) are like data structures; the abstract type is the concept of transportation.

Computational Complexity

  • Key Questions:
    • How much time does an algorithm need to finish?
    • How much space does it require?
  • Big O Notation: Used to express the worst-case scenario for time and space complexity.
    • Common Complexities:
      • Constant Time: O(1)
      • Logarithmic Time: O(log n)
      • Linear Time: O(n)
      • Quadratic Time: O(n^2)
      • Exponential: O(2^n)

Data Structures Overview

Arrays

  • Static Arrays: Fixed length data containers that are contiguous in memory.
    • Usage: Store objects, buffers, lookup tables, dynamic programming.
  • Dynamic Arrays: Can grow or shrink in size, often implemented with static arrays.
    • Complexity:
      • Access: O(1)
      • Search: O(n) worst-case
      • Insert/Delete: O(n) (due to shifting), O(1) for append.

Linked Lists

  • Structure: Nodes with pointers, can be singly or doubly linked.
  • Usage: Efficient for insertions/deletions compared to arrays.
    • Complexity:
      • Access: O(n)
      • Insert/Delete: O(1) if node reference is known.

Stacks

  • Definition: Last In First Out (LIFO) structure.
  • Operations: Push, Pop, Peek.
  • Complexity: O(1) for all operations.
  • Applications: Undo mechanisms, parsing expressions.

Queues

  • Definition: First In First Out (FIFO) structure.
  • Operations: Enqueue, Dequeue, Peek.
  • Complexity: O(1) for all operations.
  • Applications: Task scheduling, breadth-first searching.

Hash Tables

  • Definition: Maps keys to values using a hash function to ensure efficient access.
  • Collision Resolution Techniques:
    • Separate Chaining: Uses linked lists to store multiple items at the same index.
    • Open Addressing: Probes through the table to find an open slot.
  • Complexity:
    • Average case: O(1) for insert, delete, search.
    • Worst case: O(n) if many collisions occur.

Trees

  • Binary Trees: Each node has at most two children.
    • Binary Search Trees (BST): Nodes are arranged such that left children are smaller, right children are larger.
  • Traversals:
    • Pre-order, In-order, Post-order, Level-order.
    • In-order traversal of BST yields sorted order of elements.

Fenwick Trees

  • Purpose: Efficient range query and point update operations.
  • Operations: Insert, update, query in O(log n) time.
  • Construction: Can be built from an array in O(n) time.

Unique Substrings via Suffix Arrays

  • Suffix Array: An array of indices that represent sorted suffixes of a string.
  • LCP Array: Tracks the longest common prefix of adjacent suffixes.
  • Finding Unique Substrings: Use the LCP values to deduce duplicates and count unique substrings.

Union-Find Data Structure

  • Purpose: Tracks a set of elements partitioned into disjoint sets.
  • Operations: Union and Find.
  • Applications: Useful in network connectivity, Kruskal's algorithm for minimum spanning trees.
  • Complexity: Amortized O(1) for union and find operations with path compression.

AVL Trees

  • Definition: A self-balancing binary search tree where the difference in heights of left and right subtrees is at most one.
  • Operations: Insert, delete, search all in O(log n) time.
  • Rotations: Used to maintain balance after insertions/deletions.

Indexed Priority Queue

  • Purpose: Allows for priority queues with efficient key-value updates and deletions.
  • Operations: Insert, remove, update in O(log n) time.
  • Applications: Useful in scheduling algorithms, event simulation.

Conclusion

  • Understanding and mastering these data structures and their complexities is crucial in computer science, algorithms, and programming.