Understanding Apache Kafka and Its Benefits

Nov 13, 2024

Introduction to Apache Kafka

Overview

  • Apache Kafka is middleware for asynchronous event-based communication.
  • It is used to build large-scale distributed systems with an event-based architecture.
  • Component roles:
    • Producers/Publishers: Generate events/messages (e.g., security camera generating notifications).
    • Consumers/Subscribers: Subscribe and consume specific events (e.g., security dashboard subscribing to security events).
    • Components can be both producers and consumers.
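
The roles above can be sketched with a toy in-memory broker. This is only an illustration, not Kafka's client API: the `InMemoryBroker` class, the `security-events` topic name, and the event fields are all hypothetical, and delivery here is a plain synchronous function call for brevity.

```python
from collections import defaultdict


class InMemoryBroker:
    """Toy stand-in for an event broker (hypothetical, not the Kafka API)."""

    def __init__(self):
        # topic name -> list of handler callables
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # Deliver the event to every handler registered for the topic.
        for handler in self._subscribers[topic]:
            handler(event)


broker = InMemoryBroker()
dashboard_feed = []

# Consumer: the security dashboard subscribes to security events.
broker.subscribe("security-events", dashboard_feed.append)

# Producer: the camera publishes without knowing who is listening.
broker.publish("security-events", {"source": "camera-1", "type": "motion"})

# Events on a topic nobody subscribed to are simply not delivered.
broker.publish("billing-events", {"source": "billing", "type": "invoice"})
```

The camera and the dashboard only share a topic name, which is what lets either side be replaced or extended independently.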

Event-Based Paradigm Benefits

  1. Space Decoupling:
    • Producers and consumers don't need to know each other.
    • New components can be added dynamically without system reconfiguration.
    • Example: Adding new sensors without altering the existing system.
  2. Synchronization Decoupling:
    • Producers are not blocked while publishing events; consumers are notified asynchronously.
    • Promotes scalability by removing dependency between producers and consumers.
  3. Time Decoupling:
    • Middleware can store and persist events (e.g., Kafka), allowing producers/consumers to be disconnected temporarily.
    • Consumers can catch up on events once reconnected.
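
Time decoupling is easiest to see with an append-only log and a consumer that tracks its own read position. The sketch below assumes a hypothetical `EventLog` and `Consumer`; it mimics the idea of persisted events and offsets, not Kafka's storage format or client API.

```python
class EventLog:
    """Toy append-only log standing in for a persisted topic (hypothetical)."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset of the stored event

    def read_from(self, offset):
        return self._events[offset:]


class Consumer:
    """Remembers the next offset to read, so it can resume after a gap."""

    def __init__(self, log):
        self._log = log
        self.offset = 0

    def poll(self):
        batch = self._log.read_from(self.offset)
        self.offset += len(batch)
        return batch


log = EventLog()
consumer = Consumer(log)

log.append("sensor-1: 20C")
first = consumer.poll()  # consumer is connected and reads one event

# The consumer "disconnects"; producers keep publishing in the meantime.
log.append("sensor-1: 21C")
log.append("sensor-2: 19C")  # a newly added sensor, no reconfiguration

catch_up = consumer.poll()  # consumer reconnects and catches up

# Re-consuming is just reading again from an earlier offset.
replay = log.read_from(0)
```

Because the log, not the consumer, holds the events, the producer never has to wait for or even know about the consumer's absence.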

Why Focus on Kafka?

  • Widely adopted standard for event-based communication.
  • Used by more than 80% of Fortune 100 companies, according to the Apache Kafka project; well-known adopters include LinkedIn, Netflix, and The New York Times.

History of Kafka

  • Developed at LinkedIn for distributed application logs.
  • Open-sourced in 2011, became an Apache project in 2012.
  • Confluent was founded in 2014 by Kafka's original developers; it offers additional services around Kafka and is listed on NASDAQ.

Kafka Features

  • Topic-Based Communication:
    • Messages/events are organized into topics (persistent and replicated for fault tolerance).
    • Consumers can consume events flexibly, even re-consume when necessary.
  • Scalability:
    • Topics are partitioned to enhance access performance and scalability.
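
A minimal sketch of key-based partitioning follows. It assumes a hypothetical `partition_for` helper that uses CRC32 as a stable hash; Kafka's own clients use a different hash function, so the partition numbers here would not match a real cluster. The point is only that events with the same key land in the same partition and keep their relative order there.

```python
import zlib

NUM_PARTITIONS = 3  # assumed partition count for the example topic


def partition_for(key, num_partitions):
    """Map a key to a partition with a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# One list per partition; each list is an independent, ordered log.
partitions = [[] for _ in range(NUM_PARTITIONS)]

events = [
    ("camera-1", "motion"),
    ("camera-2", "motion"),
    ("camera-1", "door-open"),
]

for key, value in events:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, value))

# All events for camera-1, in the order they were produced.
camera_1_partition = partitions[partition_for("camera-1", NUM_PARTITIONS)]
camera_1_events = [value for key, value in camera_1_partition if key == "camera-1"]
```

Since partitions are independent, different consumers can read different partitions in parallel, which is where the scalability benefit comes from.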

Case Study: Microservices Architecture

  • Architecture:
    • Set of independent services with local state and inter-service communication via events.
    • Example: Order service interacting with shipping and customer services.

Event-Based vs. RPC

  • Command as Event:
    • The order service publishes a new-order event to a topic instead of invoking other services directly.
    • Benefits: Services are decoupled, allowing dynamic addition/removal without interruption.
  • Query as Event:
    • The shipping service consumes customer updates asynchronously instead of issuing synchronous requests to the customer service.
    • Benefits: Removes tight synchronous coupling and reduces latency, because the shipping service can answer from its own local state.
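
Both patterns can be sketched together, assuming a hypothetical in-memory `Bus`, hypothetical `customers` and `orders` topics, and made-up event fields. The shipping service keeps a local copy of customer addresses from customer-update events, so handling an order never requires a call to the customer service.

```python
from collections import defaultdict


class Bus:
    """Toy in-memory event bus (hypothetical, not the Kafka API)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self._handlers[topic]:
            handler(event)


class ShippingService:
    def __init__(self, bus):
        self.addresses = {}  # local state built from customer events
        self.shipments = []
        bus.subscribe("customers", self.on_customer_updated)
        bus.subscribe("orders", self.on_order_created)

    def on_customer_updated(self, event):
        # Query as event: keep a local copy instead of asking on demand.
        self.addresses[event["customer_id"]] = event["address"]

    def on_order_created(self, event):
        # Command as event: react to the order without being invoked directly.
        address = self.addresses.get(event["customer_id"], "unknown")
        self.shipments.append((event["order_id"], address))


bus = Bus()
shipping = ShippingService(bus)

# The customer service announces an address; the order service announces an order.
bus.publish("customers", {"customer_id": "c1", "address": "1 Main St"})
bus.publish("orders", {"order_id": "o1", "customer_id": "c1"})
```

Neither publisher references `ShippingService`, so another subscriber (for example, a billing service) could be added without touching the order or customer services.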

Key Concepts

  1. Event Sourcing:
    • Events are the core elements and the source of truth.
    • State is derived from events, enabling reconstruction of past states.
  2. CQRS (Command Query Responsibility Segregation):
    • Separates command (write) paths from query (read) paths.
    • Connects them through asynchronous channels.
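
The two concepts can be illustrated in a few lines, under simplifying assumptions: the event names, the `handle_command` and `build_view` helpers, and the order statuses are hypothetical, and the read view is rebuilt on demand rather than updated through an asynchronous channel as it would be in a real deployment.

```python
from functools import reduce

# Write side: an append-only list of events is the source of truth.
event_log = []

STATUS_FOR_EVENT = {"order_created": "created", "order_shipped": "shipped"}


def handle_command(kind, order_id):
    """Command path: record what happened; never edit earlier events."""
    event_log.append((kind, order_id))


def apply_event(view, event):
    """Return a new view with one event applied (the input is not mutated)."""
    kind, order_id = event
    return {**view, order_id: STATUS_FOR_EVENT[kind]}


def build_view(events):
    """Query path: derive order statuses by folding over the events."""
    return reduce(apply_event, events, {})


handle_command("order_created", "o1")
handle_command("order_created", "o2")
handle_command("order_shipped", "o1")

current_view = build_view(event_log)   # state after all events
past_view = build_view(event_log[:2])  # state as of the first two events
```

Replaying a prefix of the log is what makes past states recoverable, and keeping the view separate from the log is what lets reads and writes scale independently.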