System Design Discussion on YouTube
Jul 10, 2024
Lecture Notes: System Design Discussion on YouTube
Introduction
Speaker
: Kitty
Experience
: 5 years, working in industry since 2018
2 years at Goldman Sachs (pre-designed systems)
Freelancing (ad hoc system design)
System Design
: Beginner level
Topics Covered
Overview of YouTube system design
Functional and non-functional requirements
Capacity and storage estimation
APIs for upload and watch
Database choices and architectural decisions
CDN and caching
Live streaming protocols
Adaptive Bitrate Streaming (ABR)
Functional Requirements of YouTube
User Authentication
Video upload and storage
Subscriptions
Live streaming
Video watching (normal and live)
Search
Recommendation system
Non-Functional Requirements
Availability over Consistency
(CAP Theorem)
Low Latency
Fault Tolerance
Reliability
Capacity and Storage Estimation
Active Users
: 2.5 billion/day
Watched Videos
: 25 billion/day (~300k videos/second)
Upload to View Ratio
: 1:100
Video Size Assumption
: 1GB per video
Storage Requirement
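: ~3 TB/sec of new uploads (about 3,000 uploads/second at 1 GB each, derived from ~300k views/second and the 1:100 ratio)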
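The estimate above can be reproduced as a short back-of-envelope calculation. This is only a sketch using the figures stated in the notes; the variable names and the decimal convention of 1 TB = 1000 GB are assumptions made for illustration.

```python
# Back-of-envelope capacity estimate using the figures from the notes.
SECONDS_PER_DAY = 24 * 60 * 60           # 86,400 seconds

views_per_day = 25_000_000_000           # 25 billion watched videos per day
upload_to_view_ratio = 1 / 100           # 1 upload for every 100 views
video_size_gb = 1                        # assumed average size per video

views_per_second = views_per_day / SECONDS_PER_DAY
uploads_per_second = views_per_second * upload_to_view_ratio
# Assumption: decimal units, 1 TB = 1000 GB.
ingest_tb_per_second = uploads_per_second * video_size_gb / 1000

print(f"views/sec   ~ {views_per_second:,.0f}")
print(f"uploads/sec ~ {uploads_per_second:,.0f}")
print(f"ingest      ~ {ingest_tb_per_second:.1f} TB/sec")
```

Rounding up gives the ~300k views/second and ~3 TB/sec quoted in the notes.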
API Design
Upload API
Method:
POST /upload
Headers: Authentication details (e.g., API key)
Body:
Title
Description
Tags
Video File (stream)
Response:
202 Accepted
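As a rough illustration of the upload contract above, the sketch below models a server-side handler that validates the request and replies 202 Accepted, meaning the video has been received and will be processed later. The names `UploadRequest`, `handle_upload`, and `VALID_API_KEYS`, as well as the 401 and 400 error responses, are hypothetical and not taken from the lecture.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class UploadRequest:
    """Hypothetical in-memory view of a POST /upload request."""
    api_key: str                      # from the authentication header
    title: str
    description: str
    tags: list[str] = field(default_factory=list)
    video_bytes: bytes = b""          # stands in for the streamed video file


# Assumption: a fixed set of keys stands in for a real auth service.
VALID_API_KEYS = {"demo-key"}


def handle_upload(req: UploadRequest) -> tuple[int, dict]:
    """Validate the request and return (HTTP status, JSON-like body)."""
    if req.api_key not in VALID_API_KEYS:
        return 401, {"error": "invalid API key"}
    if not req.title or not req.video_bytes:
        return 400, {"error": "title and video file are required"}
    video_id = uuid.uuid4().hex
    # A real service would store the file and enqueue it for processing here.
    return 202, {"video_id": video_id, "status": "processing"}
```

Returning 202 rather than 200 signals that processing (splitting, encoding) happens asynchronously after the response is sent.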
Watch API
Metadata Fetch
Method:
GET /metadata
Headers: Authentication details (e.g., API key)
Params: Video ID
Response: Metadata (e.g., video info, URL, etc.)
Video Streaming
Method:
GET /watch
Headers: Authentication details (e.g., API key)
Params:
Video ID
Offset (time position)
Response: Video Chunk
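One way to picture how GET /watch turns a Video ID and a time offset into a chunk is sketched below. The fixed 4-second chunk length, the bitrate parameter, and the storage key layout are assumptions chosen for illustration; the lecture does not specify them.

```python
# Assumption: every chunk covers a fixed number of seconds of video.
CHUNK_SECONDS = 4


def chunk_index_for_offset(offset_seconds: float,
                           chunk_seconds: int = CHUNK_SECONDS) -> int:
    """Return the zero-based index of the chunk containing the offset."""
    if offset_seconds < 0:
        raise ValueError("offset must be non-negative")
    return int(offset_seconds // chunk_seconds)


def chunk_key(video_id: str, bitrate_kbps: int, offset_seconds: float) -> str:
    """Build a hypothetical object-storage key for the requested chunk."""
    index = chunk_index_for_offset(offset_seconds)
    return f"{video_id}/{bitrate_kbps}k/chunk_{index:05d}.ts"


print(chunk_key("abc", 1500, 61))
```

With these assumptions, a request for offset 61 seconds resolves to the sixteenth chunk (index 15) of the 1500 kbps rendition.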
Database Choices
User Info
: SQL database for structured, frequent queries (e.g., PostgreSQL)
Video Content
: Object storage (e.g., AWS S3)
Metadata
: NoSQL database for high write throughput and scalability (e.g., Cassandra or MongoDB)
Thumbnails
: Read-heavy database (e.g., Google Bigtable)
Architectural Components
Load Balancer
: First point of contact for client requests
Upload Service
: Receives video uploads and hands them off for splitting
Splitter Service
: Separates video into chunks
Processing Queue
: Message queue (e.g., Kafka or RabbitMQ) that holds video chunks awaiting processing
Encoder
: Encodes video chunks into various bitrates
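The flow through the components above (split, queue, encode) can be sketched in a single process. This is a toy model under stated assumptions: an in-memory `queue.Queue` stands in for Kafka or RabbitMQ, `encode` only tags the bytes instead of transcoding, and the bitrate ladder in `BITRATES_KBPS` is made up for illustration.

```python
from queue import Queue

# Assumption: an illustrative bitrate ladder, not taken from the lecture.
BITRATES_KBPS = [400, 1500, 4000]


def split(video: bytes, chunk_size: int) -> list[bytes]:
    """Splitter Service: cut the upload into fixed-size chunks."""
    return [video[i:i + chunk_size] for i in range(0, len(video), chunk_size)]


def encode(chunk: bytes, bitrate_kbps: int) -> bytes:
    """Encoder placeholder: tags the chunk instead of really transcoding."""
    return f"{bitrate_kbps}k:".encode() + chunk


def run_pipeline(video: bytes, chunk_size: int = 4) -> dict[int, list[bytes]]:
    """Push chunks through a queue and encode each one at every bitrate."""
    processing_queue: Queue = Queue()
    for index, chunk in enumerate(split(video, chunk_size)):
        processing_queue.put((index, chunk))

    renditions: dict[int, list[bytes]] = {b: [] for b in BITRATES_KBPS}
    while not processing_queue.empty():
        _index, chunk = processing_queue.get()
        for bitrate in BITRATES_KBPS:
            renditions[bitrate].append(encode(chunk, bitrate))
    return renditions
```

In a real deployment the queue decouples the services, so encoders can run on separate workers and scale independently of the upload path.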
CDN and Caching
CDN
: Content Delivery Network for geographically distributing content to reduce latency (e.g., CloudFront)
Caching
: Ensure metadata and thumbnails are cached to reduce response time
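To make the caching point concrete, here is a minimal sketch of a read-through cache with least-recently-used eviction placed in front of a metadata lookup. The `LRUCache` class and its `loader` callback are hypothetical; a production system would more likely use a dedicated cache service, and the lecture does not name one.

```python
from collections import OrderedDict
from typing import Callable


class LRUCache:
    """Read-through cache that evicts the least recently used entry."""

    def __init__(self, capacity: int, loader: Callable[[str], dict]):
        self._capacity = capacity
        self._loader = loader            # called on a miss, e.g. a DB query
        self._items: OrderedDict[str, dict] = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> dict:
        if key in self._items:
            self._items.move_to_end(key)      # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = self._loader(key)             # fall through to the database
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)   # drop the oldest entry
        return value
```

Repeated requests for a popular video's metadata are then served from memory, and only misses reach the database.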
Live Streaming
Streaming Protocol
: Uses TCP rather than UDP for live streaming, because TCP provides reliable, ordered delivery
Adaptive Bitrate Streaming (ABR)
Protocols: HTTP Live Streaming (HLS) and MPEG-DASH
Allows switching bitrates based on network conditions
Uses manifest files (.m3u8 for HLS, .mpd for MPEG-DASH) that list the available bitrates and chunks
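The bitrate-switching idea behind ABR can be sketched as a simple client-side rule: pick the highest rendition that fits within the measured throughput, with some headroom. The function name `pick_bitrate`, the 0.8 safety factor, and the example ladder are assumptions for illustration; real HLS and DASH players use more elaborate heuristics that also consider buffer levels.

```python
def pick_bitrate(available_kbps: list[int],
                 throughput_kbps: float,
                 safety: float = 0.8) -> int:
    """Choose the highest bitrate that fits within the throughput budget.

    Falls back to the lowest rendition when nothing fits, so playback
    continues at reduced quality instead of stalling.
    """
    budget = throughput_kbps * safety
    candidates = [b for b in sorted(available_kbps) if b <= budget]
    return candidates[-1] if candidates else min(available_kbps)


ladder = [400, 1500, 4000]          # assumed renditions listed in the manifest
for measured in (6000, 2000, 300):  # measured throughput in kbps
    print(measured, "->", pick_bitrate(ladder, measured))
```

The player would re-run this decision before requesting each chunk, which is what lets quality change in the middle of playback.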
Questions and Further Discussion
Upload communication: Asynchronous upload handling using HTTP keep-alive
Practical example: Network tab inspection in Chrome for chunk requests
Final thoughts on system design and on open-sourcing a YouTube-like platform