Lecture 2: Introduction to Data Structures
Overview
- Instructor: Eric Domain
- Topic: Data Structures, focusing on sequences and sets.
- Key Areas: Linked lists, dynamic arrays, interfaces (APIs/ADTs), and data structures.
Concepts
Interfaces vs Data Structures
- Interface (API/ADT): Specifies what you want to do (specification).
- Data Structure: Details how you do it (implementation).
- Interfaces specify operations on data.
- Data structures provide algorithms to support those operations.
Sequences and Sets
- Set: Store arbitrary numbers/strings, operations to maintain sorted order, search by key.
- Sequence: Maintain a specific order of elements (e.g., [5, 2, 9, 7]).
Data Structure Tools
- Arrays: Static, require specific size definition.
- Pointers: Used in linked structures.
Static Sequence Interface
- Operations: Build, Length, Iteration, Get, Set.
- Build: Create data structure from items in a specified order.
- Length: Returns the count of items.
- Iteration: Outputs items in order.
- Get/Set: Access or modify items at specific indices.
Static Arrays
- Constant Time Access: Due to consecutive memory allocation.
- Memory Allocation: Cost is linear in the size of the array.
- Word RAM Model: Memory is an array of W-bit words, log N word size suffices for addressing.
Dynamic Sequences
- Key Operations: Insert and Delete.
- Insert At: Add elements at specific indices, shifting others.
- Delete At: Remove elements from specific indices.
- Special Operations: Insert first/last, get/set first/last.
Linked Lists
- Structure: Nodes with item and next pointer.
- Efficiency:
- Insert/Delete First: Constant time.
- Get/Set At: Linear time due to traversal.
Enhancing Linked Lists
- Double Linked Lists: Adds efficiency for "get last" operations.
- Augmentation: Storing additional pointers like tail for optimized operations.
Dynamic Arrays
- Concept: Relax the static size constraint, allowing for growth/shrinkage.
- Implementation: Similar to Python lists.
- Dynamic Size: Maintain array of size
Theta(N), where N is the number of elements.
Operations
- Insert Last: Constant amortized time, double array size when full.
- Efficiency Explanation:
- Exponential Growth: Resize at powers of 2.
- Geometric Series Sum: Total resize cost remains linear.
- Amortized Analysis: Average cost per operation is constant despite occasional linear cost operations.
Summary Table of Operations
| Operation | Static Array | Linked List | Dynamic Array |
|---|
| Get/Set At | Constant | Linear | Constant |
| Insert/Delete First | Linear | Constant | Linear |
| Insert/Delete Last | Linear | Linear | Constant Amortized |
| Insert/Delete At | Linear | Linear | Linear |
Conclusion
Dynamic arrays offer a balanced solution leveraging both static arrays' speed in access and linked lists' dynamism.