Overview of System Design Concepts

Jul 31, 2024

System Design Tutorial Summary

Introduction

  • Covers system design concepts: scalability, reliability, data handling, high-level architecture
  • Focuses on system design interviews
  • Emphasizes understanding overall system, not just coding

Understanding Computer Systems

Basic Components

  • Binary System: Computers understand only 0s and 1s
    • Bit: Smallest data unit (0 or 1)
    • Byte: 8 bits (Represents a character or number)
  • Storage Units: Kilobyte, Megabyte, Gigabyte, Terabyte
  • Disk Storage: Primary data holder (HDD or SSD)
    • HDD vs SSD: SSDs are faster (roughly 500-3500 MB/s) than HDDs (roughly 80-160 MB/s)
  • RAM: Volatile memory, active data holder
    • Size: Few GBs to hundreds of GBs
    • Speed: 5000+ MB/s
  • Cache: CPU cache, faster than RAM but much smaller; holds the most frequently used data
  • CPU: Executes instructions, processes operations
  • Motherboard: Connects all components
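
To make the unit and speed figures above concrete, here is a minimal sketch that converts between units and estimates how long it takes to read a 1 GB file at the low end of each quoted range. The helper names are hypothetical, decimal units (1 KB = 1000 bytes) are an assumption, and the timings are idealized sustained-throughput estimates rather than benchmarks.

```python
# Decimal storage units (assumption: 1 KB = 1000 bytes, as drive vendors count).
KB = 1000
MB = 1000 * KB
GB = 1000 * MB
TB = 1000 * GB


def bits_in(num_bytes: int) -> int:
    """Return the number of bits in num_bytes (8 bits per byte)."""
    return num_bytes * 8


def transfer_seconds(size_bytes: int, throughput_mb_per_s: float) -> float:
    """Idealized time to read size_bytes at a sustained throughput in MB/s."""
    return size_bytes / (throughput_mb_per_s * MB)


# Reading a 1 GB file at the low end of each range from the notes.
hdd_seconds = transfer_seconds(1 * GB, 80)    # HDD at 80 MB/s
ssd_seconds = transfer_seconds(1 * GB, 500)   # SSD at 500 MB/s
ram_seconds = transfer_seconds(1 * GB, 5000)  # RAM at 5000 MB/s

print(f"HDD: {hdd_seconds} s, SSD: {ssd_seconds} s, RAM: {ram_seconds} s")
```

The gap between these numbers is the reason hot data is kept in RAM or cache instead of being read from disk on every request.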

High-Level Architecture of Production-Ready Apps

CI/CD Pipeline

  • Continuous Integration and Continuous Deployment
  • Tools: Jenkins, GitHub Actions

Handling User Requests

  • Load Balancers: Distribute incoming requests evenly across servers
  • External Storage: Separate from production server
  • Logging and Monitoring: Tools like PM2 (backend), Sentry (frontend)
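
One simple way a load balancer can spread requests evenly is round-robin: hand each new request to the next server in a fixed rotation. The sketch below is illustrative only; the class name and the server names ("app-1" and so on) are hypothetical, and real load balancers usually also account for health checks and current load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, one per request."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def next_server(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._rotation)


# Hypothetical server names, for illustration only.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.next_server() for _ in range(6)]
print(assignments)
```

With three servers and six requests, each server receives exactly two requests.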

Failure Handling

  • Alerting Systems: Integrated with platforms like Slack
  • Debugging: Use logs, replicate issues in a safe (staging) environment, then ship hotfixes

Key Principles in System Design

Scalability, Maintainability, Efficiency

  • Plan for failure, build a resilient system

Moving, Storing, Transforming Data

  • Moving Data: Optimize speed and security
  • Storing Data: SQL vs NoSQL, access patterns
  • Transforming Data: Turning raw data into meaningful information

CAP Theorem

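  • Consistency, Availability, Partition Tolerance: A distributed system can guarantee at most two of the three; because network partitions cannot be ruled out, the practical choice during a partition is between consistency and availability
    • Example: Banking system favors Consistency and Partition Tolerance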

Availability and Reliability

  • Availability: Measure of uptime
    • High availability (e.g., 99.999% uptime)
  • Reliability, Fault Tolerance, Redundancy: The system works correctly over time, keeps running when components fail, and has backup components ready to take over
  • Performance: Throughput (requests per second, queries per second), latency
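
An availability target translates directly into a downtime budget. As a rough sketch (the function name is hypothetical, and a 365-day year is assumed), 99.999% uptime leaves about 5.26 minutes of downtime per year:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime budget, in minutes, for a given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows about {minutes:.2f} minutes of downtime per year")
```

Each additional "nine" cuts the downtime budget by a factor of ten, which is why very high availability targets are expensive to meet.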

Networking Basics

IP Addressing and Data Packets

  • IPv4 vs IPv6: IPv4 (32-bit addresses), IPv6 (128-bit addresses)
  • Packets: Carry an IP header (source and destination addresses) plus the data payload
  • DNS: Translates domain names to IP addresses
  • TCP vs UDP: TCP (connection-oriented, reliable, ordered delivery), UDP (connectionless, lower overhead, no delivery guarantee)
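
The address sizes above can be checked with Python's standard `ipaddress` module. This is a small sketch; the two addresses come from the ranges reserved for documentation and are used purely as examples.

```python
import ipaddress

# Example addresses from the documentation-reserved ranges.
v4 = ipaddress.ip_address("192.0.2.1")
v6 = ipaddress.ip_address("2001:db8::1")

print(v4.version, v4.max_prefixlen)  # IP version and address width in bits
print(v6.version, v6.max_prefixlen)

# Total number of distinct addresses each width can represent.
ipv4_total = 2 ** v4.max_prefixlen
ipv6_total = 2 ** v6.max_prefixlen
print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:.3e}")
```

The 32-bit space holds about 4.3 billion addresses, which is the main reason IPv6 was introduced.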

Application Layer Protocols

  • HTTP: Stateless protocol, request-response model
  • WebSockets: Two-way (full-duplex) communication over a single long-lived connection
  • SMTP, IMAP, POP3: Email protocols
  • FTP, SSH: FTP transfers files; SSH provides secure remote login and also carries secure file transfer (SFTP, SCP)
  • WebRTC, MQTT, AMQP: WebRTC for real-time audio, video, and data between peers; MQTT and AMQP for messaging between devices and services
  • RPC: Remote procedure calls, execute code on remote servers

API Design

CRUD Operations

  • Create, Read, Update, Delete (CRUD)
  • RESTful APIs, GraphQL, gRPC
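
In RESTful APIs, the four CRUD operations are conventionally mapped onto HTTP methods. The sketch below shows that mapping next to a tiny in-memory store; the class and method names are hypothetical, and a real API would add validation, persistence, and error responses.

```python
# Conventional mapping of CRUD operations to HTTP methods in REST-style APIs.
CRUD_TO_HTTP = {
    "create": "POST",
    "read": "GET",
    "update": "PUT",  # PATCH is commonly used for partial updates
    "delete": "DELETE",
}


class InMemoryStore:
    """Toy resource store exposing the four CRUD operations."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def create(self, data: dict) -> int:
        """Store a new item and return its generated id."""
        item_id = self._next_id
        self._next_id += 1
        self._items[item_id] = dict(data)
        return item_id

    def read(self, item_id: int):
        """Return the item, or None if it does not exist."""
        return self._items.get(item_id)

    def update(self, item_id: int, data: dict) -> bool:
        """Replace an existing item; return False if it does not exist."""
        if item_id not in self._items:
            return False
        self._items[item_id] = dict(data)
        return True

    def delete(self, item_id: int) -> bool:
        """Remove an item; return False if it does not exist."""
        return self._items.pop(item_id, None) is not None


store = InMemoryStore()
user_id = store.create({"name": "Ada"})
store.update(user_id, {"name": "Ada Lovelace"})
print(store.read(user_id))
store.delete(user_id)
```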

Versioning and Rate Limiting

  • Maintain backward compatibility
  • Set rate limits to prevent abuse
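
Rate limiting can be implemented in several ways; one common technique is the token bucket, sketched below. Each request spends one token, and tokens refill at a steady rate up to a fixed capacity. The class name, parameters, and injectable clock are assumptions made for illustration, not part of any particular framework.

```python
import time


class TokenBucket:
    """Token-bucket rate limiter: allows bursts up to capacity, then throttles."""

    def __init__(self, capacity: int, refill_per_second: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock            # injectable so the limiter can be tested
        self._tokens = float(capacity) # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        """Return True if the request may proceed, False if it is rate limited."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill according to elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


# Allow bursts of 5 requests, refilling at 1 request per second.
bucket = TokenBucket(capacity=5, refill_per_second=1.0)
results = [bucket.allow() for _ in range(7)]
print(results)
```

In an API, one bucket would typically be kept per client (for example, per API key), and a rejected request would receive an HTTP 429 (Too Many Requests) response.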

Caching and CDNs

Types of Caching

  • Browser Caching: Stores responses locally on the user’s device
  • Server Caching: In-memory or disk-based caching
  • Database Caching: Cache query results
  • CDNs: Geographically distributed servers for static content

Benefits

  • Reduced latency, lower server load, improved user experience
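
The benefits above come from answering repeated requests without redoing the expensive work. A minimal sketch of a server-side in-memory cache with a time-to-live (TTL) follows; the class name, the `loader` callback, and the 60-second TTL are hypothetical choices for illustration.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock    # injectable so expiry can be tested
        self._entries = {}     # key -> (value, expires_at)

    def get_or_load(self, key, loader):
        """Return the cached value, calling loader(key) on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]    # cache hit
        value = loader(key)    # cache miss: do the slow work once
        self._entries[key] = (value, now + self._ttl)
        return value


def slow_lookup(key):
    """Stand-in for a slow database query or remote call."""
    return f"value-for-{key}"


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_load("user:1", slow_lookup)   # miss, calls slow_lookup
second = cache.get_or_load("user:1", slow_lookup)  # hit, served from memory
print(first, second)
```

The TTL bounds how stale a cached value can become, which is the usual trade-off between freshness and load on the backing store.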

Proxy Servers

Types and Usage

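  • Forward Proxy: Sits in front of clients and controls their access to external resources
  • Reverse Proxy: Sits in front of servers and manages incoming traffic to them
  • Load Balancing: A common reverse proxy role, distributing traffic across multiple servers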

Database Essentials

Types of Databases

  • SQL Databases: Relational, ACID compliant
  • NoSQL Databases: Flexible schemas, suited to unstructured or semi-structured data
  • In-Memory Databases: Keep data in RAM for fast retrieval

Scaling Databases

  • Vertical Scaling: Enhance individual server capabilities
  • Horizontal Scaling: Add more servers
    • Sharding (split data across servers), Data Replication (keep copies of data on multiple servers)
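
Sharding needs a rule that decides which server holds a given record. One simple rule, sketched here under the assumption of a fixed number of shards, is to hash the record's key and take the result modulo the shard count. The function name and key format are hypothetical.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards) using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


# The same key always lands on the same shard.
print(shard_for("user:42", 4))

# Keys spread across the shards; exact counts depend on the hash values.
distribution = Counter(shard_for(f"user:{i}", 4) for i in range(1000))
print(sorted(distribution.items()))
```

A stable hash such as SHA-256 is used instead of Python's built-in `hash()`, which is randomized per process for strings. One known drawback of plain modulo sharding is that changing the shard count remaps most keys; consistent hashing is a common way to reduce that.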

Performance Techniques

  • Caching, Indexing, Query Optimization
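
To illustrate indexing, the sketch below uses Python's built-in `sqlite3` module with a hypothetical `users` table and asks SQLite for its query plan before and after an index is created. The exact plan wording varies between SQLite versions, so the code only prints it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def query_plan(sql: str) -> str:
    """Return SQLite's query plan for sql as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT name FROM users WHERE email = 'user42@example.com'"

plan_before = query_plan(lookup)  # no index on email yet: expect a table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(lookup)   # the planner can now search via the index

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index the database has to examine every row; with it, the lookup goes through the index on `email`, at the cost of extra storage and slightly slower writes.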

Conclusion

  • Prioritize requirements, use appropriate design principles
  • Importance of getting design right from the start