
[Lecture 12] Exploring Processing in Memory Technologies

Apr 9, 2025

Lecture Notes on Processing in Memory and Processing Near Memory Architectures

Introduction

  • Presenter: Geraldo, PhD student in the SAFARI Research Group.
  • Topic: Deep dive into processing in memory (PIM) and processing near memory (PNM).
  • Lab Announcement: Lab 3 will be released online later today.

Background and Problem Statement

  • Challenge: Data movement bottlenecks in modern systems hinder performance and energy efficiency.
  • Causes:
    • Lack of data locality.
    • Insufficient memory bandwidth.
    • High latencies to main memory or storage.

Trends in Applications

  • Applications such as neural networks and transformers operate on ever-growing amounts of data.
  • Energy consumption is a growing concern, especially in battery-powered devices such as phones and laptops.

Traditional System Design

  • Historically, systems have been designed with a compute-centric (processor-centric) approach.
  • The result is highly optimized processors, but costly and inefficient data movement between memory and compute.

Processing in Memory (PIM): The Concept

  • Concept: Move computation closer to, or into, the memory where the data resides.
  • Benefits: Much larger memory bandwidth, abundant parallelism, and shorter memory access latency.
  • Two main approaches:
    • Processing-near-memory (PNM) architectures: add compute logic close to the memory (e.g., in the logic layer of a 3D stack or next to the banks).
    • Processing-using-memory (PUM) architectures: use the memory arrays themselves to compute.

Processing Using Memory (PUM) Architectures

  • Focus: Use the memory itself for computation by exploiting the analog operational principles of the memory arrays.
  • Example substrate: DRAM-based PUM designs (the techniques are detailed in the next section).
  • Benefits: Access to the memory's enormous internal bandwidth and parallelism.
  • Challenges: Destructive operations, very limited registers/buffering, and a complex programming model.

Processing-Using-Memory Techniques

  • Approach: Perform compute operations within the memory arrays themselves instead of moving the data to a processor.
  • Techniques:
    • In-memory bulk row copy and initialization.
    • In-memory majority operations, which enable bulk bitwise computation (see the sketch after this list).
    • Arithmetic and transcendental functions implemented with lookup tables.
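
To make the majority bullet concrete: in processing-using-DRAM proposals such as Ambit, simultaneously activating three DRAM rows leaves each bitline at the majority value of the three cells, and fixing one operand row to all zeros or all ones turns that majority into a bitwise AND or OR. A minimal C sketch of the logic being computed (a software simulation of the function only, not of the DRAM mechanism):

```c
#include <stdint.h>
#include <stdio.h>

/* Bitwise majority of three words: each result bit is 1 iff at least two
 * of the three corresponding input bits are 1. This is the function that
 * triple-row activation computes on every bitline in parallel. */
static uint64_t maj3(uint64_t a, uint64_t b, uint64_t c) {
    return (a & b) | (b & c) | (a & c);
}

int main(void) {
    uint64_t a = 0xF0F0F0F0F0F0F0F0ULL;
    uint64_t b = 0xFF00FF00FF00FF00ULL;

    /* MAJ(a, b, 0) == a AND b; MAJ(a, b, 1...1) == a OR b. */
    uint64_t and_result = maj3(a, b, 0);
    uint64_t or_result  = maj3(a, b, ~0ULL);

    printf("AND: %016llx (expected %016llx)\n",
           (unsigned long long)and_result, (unsigned long long)(a & b));
    printf("OR:  %016llx (expected %016llx)\n",
           (unsigned long long)or_result,  (unsigned long long)(a | b));
    return 0;
}
```

In real DRAM the operation is destructive (the participating rows end up holding the result), which is why such designs first copy operands into designated compute rows using an in-memory row copy; combined with an in-memory NOT, the majority primitive is functionally complete.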

Challenges and Solutions

  • Challenges:
    • Fixed operation granularity, which leads to underutilization.
    • Lack of support for reduction operations (commonly worked around by reducing per-core partial results on the host; see the sketch after this list).
    • Complex programming models.
  • Solutions:
    • Flexible SIMD architectures with fine-grain control.
    • Compiler and software support to ease programming.
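
To make the reduction challenge concrete: the common workaround on current systems is to have each memory-side core reduce only its local slice of the data and let the host combine the per-core partial results. A minimal C sketch of that pattern (the per-core step is simulated with a loop here; on real hardware each partial result would come from one PIM core):

```c
#include <stdint.h>
#include <stdio.h>

#define NUM_CORES      8     /* stand-in for the number of PIM cores    */
#define ELEMS_PER_CORE 1024  /* elements resident in each core's memory */

int main(void) {
    static uint32_t data[NUM_CORES][ELEMS_PER_CORE];
    uint64_t partial[NUM_CORES];
    uint64_t total = 0;

    /* Fill the per-core slices with some data. */
    for (int c = 0; c < NUM_CORES; c++)
        for (int i = 0; i < ELEMS_PER_CORE; i++)
            data[c][i] = (uint32_t)(c + i);

    /* Step 1: each PIM core reduces only its own local slice. */
    for (int c = 0; c < NUM_CORES; c++) {
        partial[c] = 0;
        for (int i = 0; i < ELEMS_PER_CORE; i++)
            partial[c] += data[c][i];
    }

    /* Step 2: the host gathers the small array of partial results and
     * performs the final reduction itself. */
    for (int c = 0; c < NUM_CORES; c++)
        total += partial[c];

    printf("total = %llu\n", (unsigned long long)total);
    return 0;
}
```

The cost of this pattern is the extra transfer of the partial results to the host, which stays small as long as the number of cores is much smaller than the data size.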

Real-World Processing-Near-Memory Architectures

  • UPMEM: the first commercially available PNM system; general-purpose cores (DPUs) sit next to the DRAM banks in a standard DIMM form factor.
  • Samsung's HBM-PIM: places SIMD units near the banks of 3D-stacked HBM, exploiting bank-level parallelism.
  • SK Hynix's AiM: a similar near-bank approach built on GDDR6, targeting AI workloads.
  • Alibaba's hybrid-bonded PNM: stacks a DRAM die directly on a logic die using hybrid bonding for tighter integration.

Programming Models

  • UPMEM SDK:
    • Manual data and kernel management: the host explicitly allocates DPUs, moves data in and out of DPU memory, and launches kernels (see the host-side sketch after this list).
    • C APIs for data transfer and kernel execution.
  • Abstracted Frameworks:
    • Data-parallel frameworks such as DaPPA that raise the level of abstraction for easier programmability.
    • TransPimLib, a library providing efficient transcendental functions on PIM systems.
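
To illustrate the kind of manual management the UPMEM SDK involves, below is a minimal host-side sketch using the SDK's documented C API (dpu_alloc, dpu_load, dpu_copy_to, dpu_launch, dpu_copy_from). The kernel binary name ./dpu_kernel, the transfer size, and the symbol names input_buffer and partial_result are placeholders; the DPU-side program (written and compiled separately with the UPMEM toolchain) is not shown:

```c
#include <dpu.h>
#include <stdint.h>
#include <stdio.h>

#define NR_DPUS   64
#define BUF_BYTES (1 << 20)   /* 1 MiB of input (placeholder size) */

int main(void) {
    struct dpu_set_t set, dpu;
    static uint8_t input[BUF_BYTES];
    uint64_t result = 0, partial;

    /* 1. Allocate a set of DPUs and load the separately compiled kernel. */
    DPU_ASSERT(dpu_alloc(NR_DPUS, NULL, &set));
    DPU_ASSERT(dpu_load(set, "./dpu_kernel", NULL));

    /* 2. Explicitly copy input data from host memory into the DPUs.
     *    Here the same buffer is broadcast to every DPU; real programs
     *    usually scatter a different slice to each DPU. The symbol name
     *    must match a variable declared in the DPU program. */
    DPU_ASSERT(dpu_copy_to(set, "input_buffer", 0, input, sizeof(input)));

    /* 3. Launch the kernel on all DPUs and wait for completion. */
    DPU_ASSERT(dpu_launch(set, DPU_SYNCHRONOUS));

    /* 4. Copy each DPU's partial result back and finish on the host. */
    DPU_FOREACH(set, dpu) {
        DPU_ASSERT(dpu_copy_from(dpu, "partial_result", 0,
                                 &partial, sizeof(partial)));
        result += partial;
    }

    printf("result = %llu\n", (unsigned long long)result);
    DPU_ASSERT(dpu_free(set));
    return 0;
}
```

Frameworks like the ones mentioned above exist precisely to hide this per-DPU allocation, transfer, and launch boilerplate behind higher-level data-parallel patterns.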

Conclusion

  • Future Directions:
    • Address data layout conversion issues.
    • Improve execution models for throughput-oriented tasks.
    • Explore space-efficient lookup table computations.
  • Research Opportunities:
    • As more PIM and PNM solutions are deployed, research in this field is becoming increasingly relevant and impactful.

  • The lecture provides an extensive overview of processing-in-memory and processing-near-memory technologies, the challenges they face, the solutions that have been proposed, and future directions. The field is evolving rapidly, with significant industry interest and academic research.