
[Lecture 21] Understanding Memory Ordering and Cache Coherence

Apr 9, 2025

Lecture Notes: Computer Architecture

Introduction

  • Topics: Memory Ordering and Cache Coherence
  • Importance: Relevant for multiprocessor and multicore systems
  • Recap: Parallel programming is difficult; there is a trade-off between performance and ease of writing correct programs

Memory Ordering

Key Points

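  • Concept: The order in which memory operations become visible determines whether parallel programs behave correctly on multiprocessor systems
  • Trade-offs: Stricter ordering guarantees make correct programs easier to write but limit hardware performance optimizations
  • Sequential Consistency: Proposed by Lamport; every execution behaves as if the operations of all processors ran in a single sequential order, with each processor's operations appearing in its program order
  • Memory Consistency: The ordering rules for memory operations from all processors, as seen by all processors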
  • Coherence vs Consistency:
    • Coherence: Ordering of operations to the same memory location (all processors agree on the order of writes to that location)
    • Consistency: Global ordering of operations to all memory locations
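
The sequential consistency definition above can be checked mechanically on a small example. The following is a minimal sketch, not material from the lecture: it enumerates every interleaving of a two-processor "store buffering" litmus test and collects the register values each interleaving produces. The programs `P0` and `P1`, the variables `x` and `y`, and all helper names are illustrative assumptions.

```python
from itertools import combinations

def interleavings(a, b):
    """Yield every merge of a and b that preserves each sequence's own order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        pos = set(positions)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in pos else next(ib) for i in range(n)]

def run(schedule):
    """Execute one global order of operations against a single shared memory."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for kind, target, arg in schedule:
        if kind == "store":
            mem[target] = arg          # store `arg` to address `target`
        else:
            regs[target] = mem[arg]    # load address `arg` into register `target`
    return regs["r1"], regs["r2"]

# Hypothetical litmus test: each processor writes one flag, then reads the other.
P0 = [("store", "x", 1), ("load", "r1", "y")]
P1 = [("store", "y", 1), ("load", "r2", "x")]

sc_outcomes = {run(s) for s in interleavings(P0, P1)}
print(sorted(sc_outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

In this toy model the outcome r1 = 0, r2 = 0 never appears, because any single global order that respects program order places at least one store before the other processor's load.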

Challenges

  • Program Correctness: Maintaining correct execution when multiple processors access shared memory concurrently
  • Debugging: Difficult when the order of memory operations is not predictable or repeatable from run to run

Models

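  • Sequential Consistency: Simplifies the programmer's job, but is costly in hardware because it restricts optimizations that reorder memory operations
  • Weak Consistency Models: Relax ordering requirements to reduce performance overhead; correctness is maintained through explicit synchronization
    • Release Consistency: Enforces ordering only around acquire and release synchronization operations; allows more parallel execution but is harder to implement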
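
To illustrate why weaker models need explicit synchronization, here is a hedged sketch of a toy relaxed model. It is not release consistency itself and not any real machine's model: it simply assumes a processor may reorder its own operations unless they touch the same address or a fence is involved. The `FENCE` marker, the programs, and every function name are assumptions made for this example.

```python
from itertools import combinations, permutations

def interleavings(a, b):
    """Yield every merge of a and b that preserves each sequence's own order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        pos = set(positions)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in pos else next(ib) for i in range(n)]

def address(op):
    kind, target, arg = op
    return target if kind == "store" else arg

def must_stay_ordered(a, b):
    """Toy rule: fences and same-address pairs keep their program order."""
    if a[0] == "fence" or b[0] == "fence":
        return True
    return address(a) == address(b)

def allowed_orders(program):
    """Yield every per-processor reordering the toy relaxed model permits."""
    n = len(program)
    for perm in permutations(range(n)):
        rank = {idx: r for r, idx in enumerate(perm)}
        if all(rank[i] < rank[j]
               for i in range(n) for j in range(i + 1, n)
               if must_stay_ordered(program[i], program[j])):
            yield [program[idx] for idx in perm]

def run(schedule):
    mem = {"x": 0, "y": 0}
    regs = {}
    for kind, target, arg in schedule:
        if kind == "store":
            mem[target] = arg
        elif kind == "load":
            regs[target] = mem[arg]
        # fences only constrain ordering; they do nothing when executed
    return regs["r1"], regs["r2"]

def outcomes(p0, p1):
    return {run(s)
            for a in allowed_orders(p0)
            for b in allowed_orders(p1)
            for s in interleavings(a, b)}

FENCE = ("fence", None, None)
P0 = [("store", "x", 1), ("load", "r1", "y")]
P1 = [("store", "y", 1), ("load", "r2", "x")]

relaxed = outcomes(P0, P1)
fenced = outcomes([P0[0], FENCE, P0[1]], [P1[0], FENCE, P1[1]])
print((0, 0) in relaxed, (0, 0) in fenced)  # True False
```

Under these assumptions, relaxing the store-to-load order admits the outcome r1 = 0, r2 = 0, and inserting a fence in each program rules it out again. This mirrors the trade described above: the hardware gains freedom, and the program must state where ordering matters.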

Cache Coherence

Key Points

  • Problem: Ensuring that all processors see the most recent value of a memory location, even when copies of it are held in several caches
  • Hardware Solutions: Preferred over software-only solutions, which are complex to get right
  • Invalidation Protocol: Simple approach in which a processor that writes a cache block invalidates all copies of that block in other caches
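
The stale-value problem and the invalidation fix can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the lecture's design: caches are unbounded dictionaries, writes go straight through to memory, and the class `ToySystem` and function `stale_read_demo` are hypothetical names.

```python
class ToySystem:
    """Private per-CPU caches in front of one shared memory (toy model)."""

    def __init__(self, n_caches, invalidate_on_write):
        self.memory = {}
        self.caches = [dict() for _ in range(n_caches)]
        self.invalidate_on_write = invalidate_on_write

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:                      # miss: fetch from memory
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]                         # hit: return the cached copy

    def write(self, cpu, addr, value):
        if self.invalidate_on_write:
            for other, cache in enumerate(self.caches):
                if other != cpu:
                    cache.pop(addr, None)          # invalidate remote copies
        self.caches[cpu][addr] = value
        self.memory[addr] = value                  # write-through keeps the sketch short

def stale_read_demo(invalidate_on_write):
    system = ToySystem(2, invalidate_on_write)
    system.read(1, "A")           # CPU 1 caches A (initially 0)
    system.write(0, "A", 42)      # CPU 0 writes a new value
    return system.read(1, "A")    # what CPU 1 observes afterwards

print(stale_read_demo(False))  # 0  (stale copy survives)
print(stale_read_demo(True))   # 42 (copy was invalidated, so the read refetches)
```

Without invalidation, CPU 1 keeps hitting on its old copy; with it, the write removes that copy and the next read misses and fetches the new value.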

Protocols

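  • MSI Protocol: Each cache block is in one of three states: Modified, Shared, or Invalid
    • MESI: Extends MSI with an Exclusive state (the only cached copy, still clean), so a later write needs no bus transaction
    • MOESI: Optimizes further by adding an Owned state, which lets a cache supply dirty data to other caches without first writing it back to memory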
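
The MSI states can be written down as a small transition table. The sketch below follows the common textbook formulation for a single cache block on a snooping bus; the event names `BusRd` and `BusRdX` and the function `access` are assumptions chosen for this example, and data movement (flushes, write-backs) is omitted.

```python
# (current state, processor access) -> (next state, bus message or None)
LOCAL = {
    ("I", "read"):  ("S", "BusRd"),
    ("I", "write"): ("M", "BusRdX"),
    ("S", "read"):  ("S", None),
    ("S", "write"): ("M", "BusRdX"),
    ("M", "read"):  ("M", None),
    ("M", "write"): ("M", None),
}

# (current state, snooped bus message) -> next state in the other caches
SNOOP = {
    ("M", "BusRd"):  "S",   # a remote reader forces a downgrade
    ("M", "BusRdX"): "I",   # a remote writer invalidates this copy
    ("S", "BusRd"):  "S",
    ("S", "BusRdX"): "I",
    ("I", "BusRd"):  "I",
    ("I", "BusRdX"): "I",
}

def access(states, cpu, op):
    """Apply one processor access to a block; return the new per-cache states."""
    new_state, bus_msg = LOCAL[(states[cpu], op)]
    result = list(states)
    if bus_msg is not None:
        for other in range(len(states)):
            if other != cpu:
                result[other] = SNOOP[(states[other], bus_msg)]
    result[cpu] = new_state
    return result

states = ["I", "I", "I"]
trace = []
for cpu, op in [(0, "read"), (1, "read"), (2, "write"), (0, "read")]:
    states = access(states, cpu, op)
    trace.append("".join(states))
print(trace)  # ['SII', 'SSI', 'IIM', 'SIS']
```

Every step of the trace keeps the protocol's key invariant: at most one cache holds the block in M, and M never coexists with S.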

Implementation

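  • Snoopy Cache: Every cache monitors (snoops) a shared bus for other caches' requests; good for small systems
  • Directory-Based Coherence: A directory tracks which caches hold each block; scalable solution for large systems
    • Tradeoffs: More complex, but avoids excessive broadcasting by sending messages only to the caches that hold the block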
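
The broadcasting trade-off can be made concrete with a rough message count. This is a back-of-the-envelope sketch under simplifying assumptions: acknowledgements and data transfers are ignored, the directory is a single dictionary of sharer sets, and `Directory`, `N_CACHES`, and the cache ids are hypothetical.

```python
class Directory:
    """Tracks, per block address, the set of caches holding a copy (toy model)."""

    def __init__(self):
        self.sharers = {}

    def read(self, cache_id, addr):
        self.sharers.setdefault(addr, set()).add(cache_id)

    def write(self, cache_id, addr):
        """Return the caches to invalidate, then record the writer as sole holder."""
        targets = self.sharers.get(addr, set()) - {cache_id}
        self.sharers[addr] = {cache_id}
        return targets

N_CACHES = 64
directory = Directory()
directory.read(3, "A")
directory.read(7, "A")

targets = directory.write(0, "A")
directory_msgs = 1 + len(targets)   # one request to the directory + one invalidation per sharer
snoopy_msgs = N_CACHES - 1          # on a bus, every other cache must snoop the write

print(sorted(targets), directory_msgs, snoopy_msgs)  # [3, 7] 3 63
```

With only two sharers among 64 caches, the directory contacts two caches where a broadcast would disturb 63. The price is the extra state and the indirection through the directory.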

Advanced Topics

  • LazyPIM: A cache coherence mechanism for processing-in-memory (PIM), i.e. near-memory computation environments
  • Token Coherence: Combines benefits of the snoopy and directory approaches

Exam Preparation

  • Review sample exam questions
  • Understand implementation challenges and trade-offs of different models

Conclusion

  • Finding a balance between hardware complexity and ease of programming is important
  • The lecture emphasized that understanding coherence and consistency is fundamental to parallel computing on multicore systems