Understanding Scalability in System Design

Oct 28, 2024

Introduction to Scalability

  • Scalability is crucial for applications that may experience sudden traffic surges.
  • A scalable system can handle increased loads by adding resources without compromising performance.

Definition of Scalability

  • Core Concept: A system's ability to manage increased load efficiently.
  • Scalability is about cost-effectively extending a system's capabilities, not merely surviving increased demand.

Key Considerations for Scalability

  • Coordination: Adding resources (processors/servers) requires effective coordination.
  • Performance Overhead: Must consider if coordination overhead negates performance gains.

Comparing Scalability

  • Use response vs. demand curves to compare system scalability.
    • X-axis: Demand
    • Y-axis: Response time
    • A more scalable system has a less steep curve.

Limits of Scalability

  • No system is infinitely scalable; each has limits.
  • Tipping point appears as a knee in the response vs. demand curve where performance degrades.
  • Goal: Push the tipping point as far right as possible.

Causes of Scaling Bottlenecks

  • Centralized Components: e.g., a single database server caps the number of simultaneous requests the system can serve.
  • High Latency Operations: Time-consuming tasks that slow down overall response time.

Building Scalable Systems

Key Principles

  1. Statelessness

    • Servers do not hold client-specific data between requests.
    • Enhances horizontal scaling and fault tolerance.
    • For stateful applications, externalize state to distributed caches/databases.
  2. Loose Coupling

    • Design system components to operate independently with minimal dependencies.
    • Allows for specific parts to be scaled without affecting the entire system.
  3. Asynchronous Processing

    • Use event-driven architecture for non-blocking operations.
    • Reduces tight coupling and risk of cascading failures, but increases complexity in error handling.
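
The statelessness principle above can be sketched in Python. This is a minimal illustration, not a production pattern: `SessionStore` is a hypothetical in-memory stand-in for an external distributed cache such as Redis, and the handler and key names are invented for the example.

```python
# Statelessness sketch: session state lives in an external store, not in
# the web server's memory, so any server instance can serve any request.

class SessionStore:
    """Hypothetical stand-in for a shared distributed cache (e.g. Redis)."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key, {})

    def put(self, key, value):
        self._data[key] = value


store = SessionStore()  # in production, shared by all server instances


def handle_request(session_id, item):
    """Stateless handler: state is fetched from, and written back to,
    the external store on every request."""
    session = store.get(session_id)
    cart = session.get("cart", [])
    cart.append(item)
    store.put(session_id, {"cart": cart})
    return cart


handle_request("user-42", "book")
print(handle_request("user-42", "pen"))  # ['book', 'pen']
```

Because no handler instance keeps per-client state between requests, adding or removing server instances (horizontal scaling) requires no session migration.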

Scaling Strategies

Vertical vs. Horizontal Scaling

  • Vertical Scaling (Scaling Up)

    • Involves increasing the capacity of a single machine.
    • Simple to adopt, but constrained by hardware ceilings and increasingly expensive at the high end.
  • Horizontal Scaling (Scaling Out)

    • Adding more machines to share workload.
    • Better fault tolerance and often more cost-effective for large-scale systems.
    • Challenges include data consistency and complexity in distributed systems.

Techniques for Building Scalable Systems

  1. Load Balancing

    • Distributes incoming requests across a pool of servers, directing each to the most capable one.
    • Uses algorithms like round-robin, least connections, or performance-based methods.
  2. Caching

    • Stores frequently accessed data to reduce latency and backend load.
    • Implementing a content delivery network can improve response times.
  3. Sharding

    • Splitting large datasets for parallel processing across servers.
    • Choose sharding strategies based on data access patterns.
  4. Avoid Centralized Resources

    • Centralized components become bottlenecks.
    • Use multiple queues, break long tasks into smaller ones, and apply design patterns that distribute workloads.
  5. Modularity

    • Create loosely coupled independent modules with defined interfaces.
    • Avoid monolithic architectures to maintain scalability and flexibility.
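
The round-robin and least-connections algorithms from technique 1 can be sketched in a few lines of Python. Server names and connection counts here are illustrative, not from the notes.

```python
# Two load-balancing strategies: round-robin (rotate through servers)
# and least-connections (pick the server with the fewest active requests).
import itertools


class RoundRobinBalancer:
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    def __init__(self, servers):
        self.connections = {s: 0 for s in servers}

    def pick(self):
        server = min(self.connections, key=self.connections.get)
        self.connections[server] += 1
        return server

    def release(self, server):
        self.connections[server] -= 1


rr = RoundRobinBalancer(["a", "b", "c"])
print([rr.pick() for _ in range(4)])  # ['a', 'b', 'c', 'a']
```

Round-robin is trivially simple but ignores server load; least-connections adapts when requests vary in cost, at the price of tracking state per server.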
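
Caching (technique 2) can be sketched as a read-through cache with time-to-live (TTL) expiry. The `slow_fetch` backend below is a hypothetical stand-in for a database call; the TTL and key names are illustrative.

```python
# Read-through TTL cache sketch: serve from the cache on a hit, fall
# through to the backend on a miss, and expire entries after a TTL.
import time


class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._entries = {}  # key -> (value, expires_at)

    def get_or_fetch(self, key, fetch):
        entry = self._entries.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]                      # cache hit
        value = fetch(key)                       # cache miss: hit the backend
        self._entries[key] = (value, time.monotonic() + self.ttl)
        return value


calls = 0


def slow_fetch(key):
    """Stand-in for an expensive backend lookup (e.g. a database query)."""
    global calls
    calls += 1
    return f"value-for-{key}"


cache = TTLCache(ttl_seconds=60)
cache.get_or_fetch("user:1", slow_fetch)
cache.get_or_fetch("user:1", slow_fetch)   # served from cache
print(calls)  # 1
```

The second lookup never reaches the backend, which is exactly the latency and load reduction described above; the TTL bounds how stale served data can be.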
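
Hash-based sharding (one common strategy for technique 3) can be sketched as follows; the shard count and user IDs are invented for illustration, and real systems often prefer consistent hashing so that resizing the cluster remaps fewer keys.

```python
# Hash-based sharding sketch: a stable hash maps each key to a shard,
# spreading data (and load) across servers.
import hashlib

NUM_SHARDS = 4


def shard_for(key):
    """Deterministically map a key to a shard via a stable hash.
    (Python's built-in hash() is randomized per process, so we use
    sha256 to get the same answer on every server.)"""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Route each record to its shard.
shards = {i: [] for i in range(NUM_SHARDS)}
for user_id in ["alice", "bob", "carol", "dave", "erin"]:
    shards[shard_for(user_id)].append(user_id)
```

As the notes say, the right strategy depends on access patterns: hashing spreads point lookups evenly, but range queries would favor range-based sharding instead.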

Ongoing Process of Scalability

  • Monitoring is essential: CPU, memory, network bandwidth, response times, throughput.
  • Continuously reassess design decisions and adapt architecture to evolving needs.