Understanding Apache Kafka Basics

Apr 14, 2025

Lecture on Apache Kafka

Introduction

  • Presenter: Daily Code Buffer
  • Topic: Understanding Apache Kafka
  • Main Points:
    • What is Kafka?
    • Why use Kafka?
    • Kafka architecture
    • Installing and using Kafka in applications

What is Kafka?

  • Definition: Apache Kafka is an open-source distributed event streaming platform that carries messages between senders (producers) and receivers (consumers).
  • Model:
    • Works on a publish-subscribe model.
    • A sender publishes data to a Kafka topic.
    • A receiver subscribes to that topic to receive new data.
    • Multiple receivers can subscribe to the same topic.
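
The publish-subscribe model above can be sketched in a few lines of plain Python. This is only a toy in-memory illustration, not Kafka's API: the class name TinyBroker, the topic name "cab-location", and the message text are assumptions made up for this sketch. Real Kafka consumers also pull (poll) data from the broker; the callbacks here are a simplification.

```python
from collections import defaultdict


class TinyBroker:
    """Toy in-memory stand-in for a message broker (hypothetical, not Kafka)."""

    def __init__(self):
        # topic name -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a receiver for a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver a message to every receiver subscribed to the topic."""
        for callback in self._subscribers[topic]:
            callback(message)


broker = TinyBroker()
rider_inbox, audit_inbox = [], []

# Two independent receivers subscribe to the same topic.
broker.subscribe("cab-location", rider_inbox.append)
broker.subscribe("cab-location", audit_inbox.append)

# One sender publishes once; both receivers get the message.
broker.publish("cab-location", "lat=12.97,lon=77.59")
```

The sender never refers to the receivers directly, which is what lets new receivers be added without changing the sender.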

Use Cases for Kafka

  • Example: Cab booking application
    • User books a cab; driver is assigned, and location updates are sent in real-time.
    • The driver's location updates are published to Kafka every second instead of being written straight to the database, so the database is not overloaded.
    • Kafka fits this case because it is built for high-throughput, real-time data streaming.

Advantages of Kafka

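  • High Throughput: Can handle very large message volumes (often cited as millions of messages) by spreading the load across the brokers of a distributed cluster.
  • Fault Tolerance:
    • Keeps replicas of each partition on different brokers in the cluster.
    • Uses a leader-follower model: one replica (the leader) handles writes and the followers copy its data, so a follower can take over if the leader's broker fails.
  • Scalability:
    • Scales to handle large data volumes efficiently.
    • More brokers (servers) can be added to the cluster as needed.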
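
The leader-follower idea can be illustrated with a small in-memory sketch. It assumes a deliberately simplified model: the class names Replica and ReplicatedPartition are hypothetical, records are copied to followers synchronously, and the "election" just picks the next surviving replica. In real Kafka, followers fetch from the leader and a controller handles leader election.

```python
class Replica:
    """One copy of a partition, living on one broker (hypothetical model)."""

    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.log = []


class ReplicatedPartition:
    """Toy partition with one leader and the remaining replicas as followers."""

    def __init__(self, broker_ids):
        self.replicas = [Replica(b) for b in broker_ids]
        self.leader = self.replicas[0]

    def append(self, record):
        # Writes go to the leader first, then are copied to every follower.
        self.leader.log.append(record)
        for replica in self.replicas:
            if replica is not self.leader:
                replica.log.append(record)

    def fail_leader(self):
        # Simulate the leader's broker going down: drop it and
        # promote the next surviving replica.
        self.replicas.remove(self.leader)
        self.leader = self.replicas[0]
        return self.leader.broker_id


partition = ReplicatedPartition(broker_ids=[1, 2, 3])
partition.append("ride-requested")
partition.append("driver-assigned")

new_leader_id = partition.fail_leader()
surviving_log = list(partition.leader.log)
```

Because every follower already holds a copy of the log, the data written before the failure is still readable from the new leader.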

Kafka Architecture

  • Components:
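    • Kafka Cluster: A group of brokers that together host the topics and their partitions.
    • Zookeeper: Manages the Kafka cluster metadata (newer Kafka releases can instead keep this metadata in Kafka itself, using KRaft mode).
    • Brokers: The servers in the cluster; they store the data and serve producers and consumers.
    • Topics and Partitions: A topic is a named stream of messages, split into partitions; each partition is an ordered log in which every message has an offset.
  • Offsets:
    • An offset is a message's position within a partition.
    • A consumer with no saved position chooses where to start fetching: from the earliest available offset or from the latest.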
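
To make topics, partitions, and offsets concrete, here is a minimal in-memory sketch. It is an illustration under stated assumptions, not Kafka's implementation: the Topic class and its method names are hypothetical, and zlib.crc32 stands in for the hash that Kafka's own partitioner uses (Kafka uses a different hash function).

```python
import zlib


class Topic:
    """Toy topic: a fixed number of partitions, each an append-only list."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # The same key always maps to the same partition, so messages
        # for one key keep their order. crc32 is a stand-in hash.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append(value)
        offset = len(self.partitions[partition]) - 1
        return partition, offset

    def start_offset(self, partition, reset="earliest"):
        # "earliest" starts at the beginning of the partition,
        # "latest" starts after the last message currently stored.
        if reset == "earliest":
            return 0
        return len(self.partitions[partition])

    def read(self, partition, offset):
        # Return every message at or after the given offset.
        return self.partitions[partition][offset:]


topic = Topic("cab-location", num_partitions=3)
p1, o1 = topic.append("driver-42", "loc-1")
p2, o2 = topic.append("driver-42", "loc-2")

from_earliest = topic.read(p1, topic.start_offset(p1, "earliest"))
from_latest = topic.read(p1, topic.start_offset(p1, "latest"))
```

Starting from the earliest offset replays everything stored in the partition, while starting from the latest returns only messages that arrive afterwards.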

Installing Kafka

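  • Requirements: Java 8+ (as stated in the lecture; newer Kafka releases require a more recent Java version, so check the documentation for the release you download)
  • Steps:
    1. Download Kafka from the official website and extract the archive.
    2. Start Zookeeper (needed only for Zookeeper-based setups; clusters running in KRaft mode do not use it).
    3. Start the Kafka broker.
    4. Create the topics that will store the events.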

Using Kafka

  • Producers and Consumers:
    • Producers send messages to Kafka topics.
    • Consumers read messages from Kafka topics.
    • Use the command-line tools bundled with Kafka (for example kafka-topics.sh) to create, describe, and verify topics.
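
The producer and consumer roles can be sketched without a running broker. The example below is a toy model with hypothetical names (PartitionLog, PollingConsumer, and the max_records parameter are assumptions of this sketch): it shows that consumers pull batches of messages, that reading does not remove messages from the log, and that each consumer keeps its own position.

```python
class PartitionLog:
    """Toy single-partition log that producers append to."""

    def __init__(self):
        self.records = []

    def append(self, value):
        self.records.append(value)
        return len(self.records) - 1  # offset of the new record


class PollingConsumer:
    """Toy consumer that pulls batches and tracks its own position."""

    def __init__(self, log):
        self.log = log
        self.position = 0  # next offset to read

    def poll(self, max_records=2):
        end = self.position + max_records
        batch = self.log.records[self.position:end]
        self.position += len(batch)
        return batch


log = PartitionLog()
for message in ["m0", "m1", "m2"]:
    log.append(message)  # producer side

consumer_a = PollingConsumer(log)
consumer_b = PollingConsumer(log)

a_first = consumer_a.poll()
a_second = consumer_a.poll()
b_first = consumer_b.poll()  # unaffected by what consumer_a already read
```

Because the log keeps its records after they are read, a second consumer can read the same messages at its own pace.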

Example Application

  • Scenario: Cab booking with two applications (driver and user apps).
  • Driver App: Publishes location data to Kafka.
  • User App: Subscribes to Kafka to receive location updates.
  • Tools: Spring Boot applications connected to Kafka.
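
The data flow between the two applications can be sketched as follows. The lecture builds them as Spring Boot applications; this Python version only illustrates the flow and is not that code. The function names, the JSON field names (driverId, lat, lon), and the use of queue.Queue as a stand-in for a Kafka topic are all assumptions made for this sketch.

```python
import json
import queue

# Stand-in for the Kafka topic that carries location updates.
location_topic = queue.Queue()


def driver_app_publish(driver_id, lat, lon):
    """Driver app: serialize a location event to JSON bytes and publish it."""
    event = {"driverId": driver_id, "lat": lat, "lon": lon}
    location_topic.put(json.dumps(event).encode("utf-8"))


def user_app_consume():
    """User app: read all pending events and deserialize them."""
    updates = []
    while not location_topic.empty():
        raw = location_topic.get()
        updates.append(json.loads(raw.decode("utf-8")))
    return updates


driver_app_publish("driver-42", 12.9716, 77.5946)
driver_app_publish("driver-42", 12.9720, 77.5950)

updates = user_app_consume()
latest_position = (updates[-1]["lat"], updates[-1]["lon"])
```

Serializing each update as a small JSON message keeps the driver and user apps independent: they only need to agree on the topic and the message fields.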

Conclusion

  • Apache Kafka is well suited to applications that need high-throughput, real-time data streaming.
  • It is widely used in e-commerce, travel, banking, and other industries.
  • It is an important tool for developers and a common topic in technical interviews.

Additional Resources

  • Code Samples: Shared in the video description.
  • Further Learning: Recommended for professionals in organizations using Kafka.