Netflix System Design Summary

Jun 11, 2025

Summary

  • The meeting was a mock interview focused on designing the core product of Netflix, specifically around users, videos, and supporting a recommendation engine with user activity data.
  • The discussion covered system requirements, scalability estimates, storage solutions, caching, sharding, global content delivery, and user experience considerations.
  • Andreas, a software engineer at Modern Health, was the interviewee; the interviewer provided feedback and suggestions throughout.
  • Key architectural decisions, trade-offs, and system design principles such as CAP theorem, caching strategies, and CDN use were discussed in detail.

Action Items

  • No concrete follow-ups or assignments were specified for after the mock interview.

Core Product Requirements & High-Level Design

  • The design should focus on handling users, videos, and gathering user activity data for the recommendation engine, ignoring features like search and subscriptions.
  • The recommendation engine’s algorithm is out of scope; the focus is on data gathering and processing.

Estimations & Data Types

  • Assumed 200 million users and 10,000 videos, each averaging an hour in length, with both SD (10 GB/hour) and HD (20 GB/hour) versions, equating to roughly 300 TB of video storage.
  • Core data types: Video content files, video static content (titles, descriptions, thumbnails, cast), user metadata (watch history, likes, resume-points), and user activity logs (clicks, impressions, etc.).
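The storage figure above follows directly from the stated assumptions; a minimal sketch of the arithmetic (using decimal units, 1 TB = 1,000 GB):

```python
# Back-of-the-envelope storage estimate from the session's assumptions.
videos = 10_000
hours_per_video = 1
sd_gb_per_hour = 10   # SD encoding
hd_gb_per_hour = 20   # HD encoding

# Each video is stored in both SD and HD.
total_gb = videos * hours_per_video * (sd_gb_per_hour + hd_gb_per_hour)
total_tb = total_gb / 1_000
print(total_tb)  # 300.0 TB
```

Note that user count (200 million) does not enter the video-storage estimate, since the catalog is fixed; it matters instead for metadata and activity-log volume.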

Storage Architecture

  • Video Content: Store in cloud blob storage (e.g., Amazon S3, Google Cloud Storage) for easy scaling and global availability. Justified by the large file sizes and the relatively fixed size of the video catalog.
  • Static Video Content: Store in a relational database (e.g., PostgreSQL). Cache frequently accessed static content for faster reads.
  • User Metadata: Store in a sharded relational database system (e.g., sharded PostgreSQL), sharded and indexed by user ID so each user's queries hit a single shard, balancing load and keeping lookups efficient at scale.
  • Caching for both static content and user metadata should use in-memory solutions (e.g., Redis or Memcached) to increase speed.
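The user-ID sharding decision above can be illustrated with a hash-based router. This is a sketch under assumed parameters (the shard count and hash choice are illustrative, not from the session):

```python
import hashlib

NUM_SHARDS = 8  # assumption for illustration; real systems tune this


def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard via a stable hash, so all of a user's
    metadata (watch history, likes, resume points) lives on one shard
    and a single query never fans out across the fleet."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# The same user always routes to the same shard:
assert shard_for_user("user-42") == shard_for_user("user-42")
```

A modulo scheme like this reshuffles most keys when `num_shards` changes; consistent hashing is the usual refinement if resharding is expected.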

API & Scalability

  • Horizontally scalable API services behind a load balancer to distribute user request load.
  • The cache should use a write-through strategy for frequently accessed data, so cached entries stay consistent with the database while reads are served at in-memory latency.
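The write-through idea can be sketched as follows, with a plain dict standing in for the database and another for Redis/Memcached (names and structure are illustrative only):

```python
class WriteThroughCache:
    """Minimal write-through sketch: every write updates the cache and
    the backing store synchronously, so cache reads are never stale
    relative to the database."""

    def __init__(self, store: dict):
        self.store = store  # stands in for the relational database
        self.cache = {}     # stands in for Redis / Memcached

    def write(self, key, value):
        self.store[key] = value  # write to the database...
        self.cache[key] = value  # ...and to the cache, in the same operation

    def read(self, key):
        if key in self.cache:        # cache hit: no database round-trip
            return self.cache[key]
        value = self.store[key]      # cache miss: fall back to the database
        self.cache[key] = value      # populate for subsequent reads
        return value
```

The trade-off is slightly slower writes (two updates per write) in exchange for consistent, low-latency reads on hot data such as resume points.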

Global Distribution & Latency

  • Address global user base and low latency requirements by serving video content through a Content Delivery Network (CDN), populated based on geographical demand.
  • CDN population is managed by a background job, ensuring popular content is distributed close to users.
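The core decision inside such a background job can be sketched as a per-region popularity count: tally recent view events by region and pre-position the top videos on that region's edge nodes. Event shape and the `top_n` cutoff here are assumptions for illustration:

```python
from collections import Counter


def pick_regional_catalog(view_events, top_n=100):
    """Given (region, video_id) view events, return the most-watched
    videos per region -- the set a background job would push to that
    region's CDN edge nodes."""
    per_region = {}
    for region, video_id in view_events:
        per_region.setdefault(region, Counter())[video_id] += 1
    return {
        region: [vid for vid, _ in counts.most_common(top_n)]
        for region, counts in per_region.items()
    }


events = [("eu", "v1"), ("eu", "v1"), ("eu", "v2"), ("us", "v3")]
catalog = pick_regional_catalog(events, top_n=1)
# catalog == {"eu": ["v1"], "us": ["v3"]}
```

Running this periodically (rather than on every request) matches the session's framing: popularity shifts slowly enough that a batch job keeps edge caches warm without per-request overhead.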

Design Trade-Offs & Rationale

  • Chose relational databases over NoSQL for complex querying capabilities and compatibility with background jobs, despite increased complexity with sharding.
  • Accepted trade-offs in consistency to prioritize availability and partition tolerance, aligning with CAP theorem considerations for a globally distributed, high-availability system.

Debrief & Feedback

  • Andreas reflected that his calculations and system breakdowns were strong, but the presentation flow could have been improved by separating estimations from the system design phase.
  • The interviewer highlighted good attention to user experience, caching, and global considerations, as well as strong use of trade-off analysis (CAP theorem).
  • Suggestions included starting with higher-level component diagrams after requirements gathering, then moving into detailed estimations.

Decisions

  • Use blob storage for video content — Justified by file size, fixed catalog, and scalability.
  • Store static content and user metadata in sharded relational databases — Enables complex queries; user-based sharding optimizes for expected access patterns.
  • Leverage global CDN for video delivery — Ensures low latency and good user experience worldwide.

Open Questions / Follow-Ups

  • None raised; all discussion points were addressed within the session.