[Lecture 15] Understanding SSD Architecture and Management

Apr 9, 2025

Lecture 15: Computer Architecture - Storage Architecture

Introduction

  • Topic: Flash memory and solid-state drives (SSDs)
  • Focus: Storage architecture and SSD management
  • Background: Brief background on flash memory in this lecture; a deeper dive follows tomorrow
  • Research: Mention of MQSim simulator and research papers
  • Projects: Ongoing projects in this area for interested students

SSD Organization

  • Components:
    • SSD controller with multiple cores and hardware flash controllers
    • DRAM (LPDDR)
    • NAND flash memory packages
  • Flash Controllers:
    • Handle flash requests, ECC, data randomization, and encryption

SSD Controller Components

  • Host Interface Layer (HIL): Implements protocol interface (e.g., SATA, NVMe)
  • Flash Translation Layer (FTL): Manages data caching, address translation, garbage collection, wear leveling
  • Flash Controllers: Perform ECC and data randomization; manage the NAND packages
  • DRAM Module:
    • Holds the request queues
    • Write buffer
    • Logical-to-physical mapping table
    • Metadata such as program/erase cycle counts

Read and Write Operations

  • Write Operation:

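    • Write buffer is used to reduce write latency
    • Flexible I/O scheduling
    • Address translation with out-of-place writes: updated data is written to a free page and the old copy is marked invalid
    • Logical-to-physical mapping is essential because it records where each logical page currently resides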
    • Garbage collection and wear leveling
  • Read Operation:

    • Check the write buffer for the requested data first
    • Translate the address if the data is not in the buffer
    • ECC decoding and de-randomization
    • Retry the read with an adjusted read reference voltage if ECC decoding fails
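
The write and read paths above can be sketched as a toy model. This is only an illustration, not how any particular SSD or simulator implements them: the class name `TinySSD`, its fields, and the buffer-flush policy are all assumptions made for the example, and garbage collection is left out.

```python
class TinySSD:
    """Toy model (hypothetical): write buffer + page-level L2P map,
    with out-of-place writes. No garbage collection."""

    def __init__(self, num_physical_pages, buffer_capacity=2):
        self.flash = [None] * num_physical_pages   # contents of physical pages
        self.valid = [False] * num_physical_pages  # is the physical page still live?
        self.l2p = {}                              # logical page -> physical page
        self.write_buffer = {}                     # logical page -> unflushed data
        self.buffer_capacity = buffer_capacity
        self.next_free = 0                         # next never-written physical page

    def write(self, lpn, data):
        # Writes complete as soon as the data is in the buffer (low latency).
        self.write_buffer[lpn] = data
        if len(self.write_buffer) > self.buffer_capacity:
            self.flush()

    def flush(self):
        for lpn, data in self.write_buffer.items():
            if self.next_free >= len(self.flash):
                raise RuntimeError("no free pages; garbage collection needed")
            old = self.l2p.get(lpn)
            if old is not None:
                self.valid[old] = False            # old copy becomes invalid
            ppn = self.next_free                   # out-of-place: always a new page
            self.next_free += 1
            self.flash[ppn] = data
            self.valid[ppn] = True
            self.l2p[lpn] = ppn                    # update the mapping
        self.write_buffer.clear()

    def read(self, lpn):
        if lpn in self.write_buffer:               # 1. check the write buffer
            return self.write_buffer[lpn]
        ppn = self.l2p.get(lpn)                    # 2. translate the address
        return None if ppn is None else self.flash[ppn]
```

Rewriting a logical page that has already been flushed leaves the old physical page in place but marked invalid, which is exactly the space that garbage collection later has to reclaim.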

Flash Cell Characteristics

  • Basic Operation:
    • Program by injecting electrons into the cell, which raises its threshold voltage
    • Erase by tunneling the electrons back out
  • Endurance Problem: Program/erase processes are imperfect, so cells wear out over repeated cycles
  • Threshold Voltage: Determines the stored state (0 or 1)
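
To connect threshold voltage to the read-retry step mentioned under Read Operation, here is a minimal sketch. It assumes the usual single-level-cell convention that an erased (low threshold voltage) cell reads as 1 and a programmed cell as 0. The voltages, the step size, and the single parity bit standing in for real ECC are all made-up values for illustration.

```python
def sense(vths, vref):
    """Compare each cell's threshold voltage with the read reference voltage.
    Cells below vref read as 1 (erased), cells at or above it as 0 (programmed)."""
    return [1 if v < vref else 0 for v in vths]


def parity_ok(bits, expected_parity):
    """Stand-in for ECC decoding: a single parity bit (assumption, not real ECC)."""
    return sum(bits) % 2 == expected_parity


def read_with_retry(vths, expected_parity, vref=2.0, step=0.2, max_retries=5):
    """Lower the reference voltage step by step until the check passes."""
    for attempt in range(max_retries + 1):
        trial_vref = vref - attempt * step
        bits = sense(vths, trial_vref)
        if parity_ok(bits, expected_parity):
            return bits, trial_vref
    raise IOError("read failed after all retries")


# Stored data was [1, 0, 0, 1]; the second cell has lost charge and drifted
# from about 3.0 down to 1.9, just below the default reference of 2.0.
drifted = [0.5, 1.9, 3.0, 0.4]
bits, used_vref = read_with_retry(drifted, expected_parity=0)
```

In this example the first read misjudges the drifted cell, the parity check fails, and the retry with a lower reference voltage recovers the original bits.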

NAND Flash Structure

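  • NAND String: Flash cells connected in series
  • Page and Block: Cells are organized into pages (the unit of read and program), and pages into blocks (the unit of erase)
  • Multilevel Cell (MLC): Stores multiple bits per cell
    • More threshold voltage levels = more bits per cell
    • Reduces reliability because each threshold voltage state becomes narrower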
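
The trade-off above follows from simple arithmetic: storing n bits per cell requires 2^n distinguishable threshold voltage states within the same overall voltage range. The sketch below assumes the range is split evenly and ignores the margins between states; the 6.0 V range is an arbitrary example value.

```python
def threshold_states(bits_per_cell):
    """Number of threshold voltage states needed to store the given bits."""
    return 2 ** bits_per_cell


def state_width(vth_range, bits_per_cell):
    """Voltage width available per state, assuming an even split of the
    range and no guard margins (a simplification)."""
    return vth_range / threshold_states(bits_per_cell)


for bits in (1, 2, 3, 4):
    print(bits, threshold_states(bits), state_width(6.0, bits))
```

Each extra bit halves the width available per state, so the same amount of threshold voltage shift is more likely to push a cell into a neighboring state.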

Performance Evaluation

  • SSD Performance Metrics:
    • Latency, throughput, bandwidth
  • NAND Flash Performance Parameters:
    • Sensing time, programming time, erase time
    • I/O rate
  • Optimizations:
    • Subpage sensing
    • Random data out
    • Cache read command
    • Multiplane operations
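
As a rough illustration of how these parameters combine, a page read can be modeled as sensing time plus the time to transfer the page over the flash channel. The model and the numbers used below (50 µs sensing, 16 KiB pages, 800 MB/s I/O rate) are assumptions for the example, not figures from the lecture. The multiplane variant assumes the planes are sensed concurrently while their data still leaves over one shared channel.

```python
def page_read_latency_us(t_sense_us, page_bytes, io_rate_mb_s):
    """Sensing time plus transfer time for one page.
    bytes / (MB/s) gives microseconds when 1 MB = 10**6 bytes."""
    return t_sense_us + page_bytes / io_rate_mb_s


def multiplane_read_latency_us(t_sense_us, page_bytes, io_rate_mb_s, planes):
    """One shared sensing phase for all planes, then serialized transfers."""
    return t_sense_us + planes * page_bytes / io_rate_mb_s


single = page_read_latency_us(50, 16384, 800)
two_sequential = 2 * single
two_multiplane = multiplane_read_latency_us(50, 16384, 800, planes=2)
```

Under these assumptions the multiplane read saves one full sensing time compared with two back-to-back single-plane reads.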

Address Mapping and Garbage Collection

  • Mapping:
    • Logical to physical mapping
    • Handling updates with out-of-place writes
  • Garbage Collection:
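    • Reclaims free space by copying the remaining valid pages out of a victim block and then erasing the whole block (erase works on blocks, not on individual pages)
    • Adds performance and lifetime overhead because of the extra copies and erases
    • Strategies to improve efficiency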
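
One common victim-selection strategy is greedy: pick the block with the fewest valid pages, since it needs the fewest copies. The sketch below assumes that policy; the lecture notes do not say which policy was discussed, and the data layout (`blocks`, `erase_counts`) is hypothetical.

```python
def select_victim(blocks):
    """blocks: {block_id: [lpn or None, ...]}, where None marks an invalid page.
    Greedy policy: choose the block with the fewest valid pages."""
    return min(blocks, key=lambda b: sum(p is not None for p in blocks[b]))


def collect(blocks, erase_counts):
    """Run one garbage collection step.
    Returns the victim id and the logical pages that must be rewritten
    elsewhere before the erase (the copy overhead)."""
    victim = select_victim(blocks)
    moved = [p for p in blocks[victim] if p is not None]
    del blocks[victim]            # block is erased and no longer a GC candidate
    erase_counts[victim] += 1     # each erase consumes part of the block's lifetime
    return victim, moved


blocks = {
    0: [10, None, None, 11],      # 2 valid pages
    1: [None, None, None, 12],    # 1 valid page: cheapest to reclaim
    2: [1, 2, 3, 4],              # fully valid: worst candidate
}
erase_counts = {0: 0, 1: 0, 2: 0}
victim, moved = collect(blocks, erase_counts)
```

Here block 1 is chosen and only one page has to be copied. The erase counter is the kind of metadata wear leveling would consult.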

Fine-Grain Mapping

  • Handling Small Writes:
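    • Fine-grain mapping (at a granularity smaller than a physical page) reduces the extra flash operations caused by small writes
  • Issues with Small Writes:
    • Inefficient because flash pages cannot be overwritten in place (erase-before-write property)
    • Read-modify-write operations add overhead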
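
The cost difference can be sketched by counting flash operations. The model is deliberately simple and rests on stated assumptions: with page-granularity mapping, every small update to an already-written page triggers a read-modify-write; with fine-grain mapping, small writes to arbitrary logical addresses can be packed together in the write buffer and programmed as full pages. The function names and the default of 4 sectors per page are illustrative.

```python
def flash_ops_coarse(num_small_writes):
    """Page-level mapping: each small update reads the old page,
    merges the new data, and programs a whole new page."""
    return {"reads": num_small_writes, "programs": num_small_writes}


def flash_ops_fine(num_small_writes, sectors_per_page=4):
    """Sector-level mapping: small writes are packed into full pages,
    so no old data has to be read back."""
    programs = -(-num_small_writes // sectors_per_page)  # ceiling division
    return {"reads": 0, "programs": programs}


coarse = flash_ops_coarse(8)
fine = flash_ops_fine(8)
```

The price of fine-grain mapping is a larger mapping table, which this sketch does not model.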

Multi-Plane Operations

  • Superblock Based Management:
    • Group blocks across planes for parallel operations
    • Improved performance through multiplane writes and reads
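
A minimal sketch of superblock grouping, assuming the simplest scheme in which block i of every plane forms superblock i and consecutive writes are striped across the planes at the same page offset. The function names and tuple layout are made up for the example.

```python
def build_superblocks(num_planes, blocks_per_plane):
    """Superblock i = block i from every plane, as (plane, block) pairs."""
    return [[(plane, blk) for plane in range(num_planes)]
            for blk in range(blocks_per_plane)]


def superblock_page_order(superblock, pages_per_block):
    """Order in which pages are filled: stripe across planes first, so each
    group of len(superblock) consecutive pages shares a page offset and can
    be issued as one multiplane program."""
    return [(plane, blk, page)
            for page in range(pages_per_block)
            for (plane, blk) in superblock]


superblocks = build_superblocks(num_planes=2, blocks_per_plane=3)
order = superblock_page_order(superblocks[0], pages_per_block=2)
```

With two planes, the first two entries of `order` land on different planes at page 0, so they can be written in parallel.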

I/O Scheduling

  • NVMe Protocol:
    • Bypasses much of the OS I/O stack so the host communicates more directly with the SSD
    • Raises fairness and performance challenges
    • Requires internal SSD-level scheduling strategies
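
To illustrate what an SSD-level scheduling strategy might look like, here is a round-robin sketch that takes one request at a time from each host queue, so a queue with many outstanding requests cannot starve the others. This is only one simple policy chosen for illustration; the notes do not state which strategies the lecture covered.

```python
from collections import deque


def round_robin_schedule(queues):
    """Interleave requests from several queues, one per queue per round.
    Returns the order in which requests would be issued to the flash back end."""
    pending = [deque(q) for q in queues]
    order = []
    while any(pending):              # an empty deque is falsy
        for q in pending:
            if q:
                order.append(q.popleft())
    return order


# Flow A has three queued requests, flow B only one.
issued = round_robin_schedule([["a1", "a2", "a3"], ["b1"]])
```

With plain first-come-first-served ordering, `b1` could wait behind all of flow A's requests; here it is served in the first round.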

Conclusion

  • SSD management involves numerous challenges and strategies
  • Importance of ongoing research to enhance performance and reliability

Next Steps

  • Continue with NVMe discussions and specific strategies for I/O scheduling
  • Explore flash memory reliability in more depth tomorrow
