Transcript for:
Comparing Kafka and RabbitMQ Systems

So in this video, we're going to be talking about the difference between Kafka and RabbitMQ, or more broadly, stream processing systems and traditional message queues. With distributed systems, a common mistake is thinking that these two systems are interchangeable, but they actually serve very different purposes, and using one of them when you should be using the other can cause a lot of problems down the road. So let's take a look at the main differences in their design.

Kafka is inherently a stream processing system, so it's designed for taking in a stream of events and sending them off to a bunch of different consumers of those events. Kafka has very high throughput, and it keeps all the messages around until their time-to-live expires, so even messages that have already been consumed can still be replayed later.

Kafka is also fan-out by default, so if we have multiple consumers connected to the same queue, each consumer will get one copy of each record. Traditional queues such as RabbitMQ, on the other hand, are designed to be message queuing systems, so they're designed to take in messages, queue them until they're ready to be processed, and then send them off to the processor. Traditional queues can handle complex message routing, so this is really handy if we want to be able to route a specific message based on its properties to specific queues. Traditional queues also generally deliver each message to exactly one consumer.

So whereas Kafka is fan-out, with a traditional message queue, if you have multiple consumers connected to one queue, each message will be routed to exactly one consumer, not every consumer. Traditional message queues are designed for moderate data volumes. They're still very fast, but they don't handle the same amount of throughput that Kafka does. So those are some high-level differences between how these two systems are designed, but now let's dive into the lower-level details of why they work that way.

So the first thing we're going to take a look at here is consumer patterns. So as we mentioned before, Kafka is fan-out while traditional message queues are not. So if we have, for example, three consumers connected to one queue, and we send three messages into that queue, each consumer will receive all three messages. With RabbitMQ, on the other hand, each consumer will receive one of the three messages.
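The two consumer patterns above can be sketched as a toy simulation. This is not real client code (a real setup would use a Kafka or AMQP client library against a live broker); the function names `fan_out` and `work_queue` are illustrative only.

```python
from itertools import cycle

def fan_out(messages, consumers):
    """Kafka-style: every consumer gets its own copy of every message."""
    return {c: list(messages) for c in consumers}

def work_queue(messages, consumers):
    """RabbitMQ-style: each message goes to exactly one consumer (round-robin)."""
    delivered = {c: [] for c in consumers}
    for msg, consumer in zip(messages, cycle(consumers)):
        delivered[consumer].append(msg)
    return delivered

msgs = ["m1", "m2", "m3"]
# Fan-out: all three services see all three messages.
print(fan_out(msgs, ["logging", "analytics", "realtime"]))
# Work queue: the three messages are split across the workers.
print(work_queue(msgs, ["worker-a", "worker-b", "worker-c"]))
```

The same three messages produce nine deliveries in the fan-out case but only three in the work-queue case, which is exactly the trade-off the video is describing.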

So Kafka and RabbitMQ both support doing this the other way around as well, but it requires a little bit more setup and isn't quite as scalable. So the Kafka pattern of fanning out is really useful for things where we have events coming in and we want to distribute them to multiple independent services. So for example, we could have a stream of events coming in and we want to send those out to logging.

We want to do some data analytics on those events and we want to send real-time updates to our users. Each event that's coming in needs to be processed by each one of these services. However, the individual job of these services is relatively small and likely doesn't need to be scaled horizontally too much. This means that most of our messages are simply fanning out to each of these three consumers.

RabbitMQ, on the other hand, is great for situations where we have messages coming in, and we need to process those messages. So in this example, we have two processors, and any message that comes in needs to be read by one of these two processors. We could expand these processors out to hundreds of replicas, and we wouldn't really have any problems here, because RabbitMQ will handle distributing each message to exactly one of those processors. If you want to see some more examples of how distributed queues are used in real-world situations, you should check out our Systems End-to-End course on interviewpen.com, where we have a ton of in-depth examples of how these systems are used.

All right, so the next thing we're going to take a look at here is message routing. So with Kafka, all message routing is handled by the producer. So a Kafka setup can consist of multiple queues organized into topics and partitions, and the producer is solely responsible for determining which queue its data goes into. This has the advantage of the Kafka cluster itself not having to do a lot of work, which is part of what allows it to handle such high throughput.

On the producer side, we're able to send messages to one or many queues based on properties of that message. And if we have a situation where we don't want to fan out, and we instead want to have multiple partitions and multiple consumers that each get one message, we can do that as well on the producer side by having the producer hash the message to determine which partition it goes into. This also enables us to scale our Kafka cluster better because there's no single point where all messages have to go through to be routed. One of the problems with this, however, is that once a message has been produced, we have no control over where it actually goes.
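Producer-side partitioning can be sketched in a few lines: the producer, not the broker, hashes the message key to pick a partition. Real Kafka clients use a murmur2 hash; `zlib.crc32` here is just a stand-in for illustration, and `NUM_PARTITIONS` is an assumed configuration value.

```python
import zlib

NUM_PARTITIONS = 4

def choose_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Deterministically map a message key to a partition index."""
    return zlib.crc32(key.encode()) % num_partitions

# Messages with the same key always land in the same partition, so the
# consumer attached to that partition sees a consistent slice of the stream.
for user in ["alice", "bob", "alice"]:
    print(user, "->", choose_partition(user))
```

Because the hash is deterministic, all events for "alice" end up on the same partition, which is how Kafka preserves per-key ordering without any central router.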

RabbitMQ, on the other hand, introduces exchanges, which take in all of the messages and route them to different queues. This exchange can do things like routing a message to one of two queues based on properties of that message, and it can also handle duplicating messages between multiple queues to enable a fan-out style approach. What's nice about this is our consumers now have control over what messages they're consuming from this queue.
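Here's a toy model of that routing behavior. A real RabbitMQ direct or topic exchange would be declared through an AMQP client like pika (with wildcard patterns on topic exchanges); the `route` function and the binding data below are purely illustrative.

```python
def route(exchange_bindings, routing_key, message):
    """Deliver `message` to every queue whose bindings match `routing_key`."""
    deliveries = {}
    for queue, bound_keys in exchange_bindings.items():
        if routing_key in bound_keys:
            deliveries.setdefault(queue, []).append(message)
    return deliveries

# Hypothetical bindings: consumers declare which routing keys they care about.
bindings = {
    "email_queue": ["user.signup", "user.reset"],
    "audit_queue": ["user.signup", "order.created"],
}

# "user.signup" matches two bindings, so the exchange duplicates the message
# into both queues (fan-out); "user.reset" matches only one queue.
print(route(bindings, "user.signup", {"id": 1}))
print(route(bindings, "user.reset", {"id": 2}))
```

Note that the bindings live on the consumer side: a queue opts in to the routing keys it wants, which is the control-inversion the video is pointing at.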

So in a situation where we're not fanning out, this enables us to balance the load between multiple consumers much better, especially for tasks that might have highly variable processing times. So let's look at what this actually means for whether or not you should use Kafka or RabbitMQ. Kafka is designed to take in uniform messages that require a short time to process. So these are things like streaming events where we want the data to go to multiple places at once and decouple systems, but there's not a huge cost associated with actually processing a single message.

Kafka does well when messages are being fanned out to many systems. So if we have multiple independent systems that are looking at the same pieces of data, Kafka's a good fit. Kafka is also very, very fast.

So if we really need extremely high throughput, Kafka is going to be a necessity. Now, traditional message queues like RabbitMQ, on the other hand, are great for long-running tasks where we don't know how long they're going to take to complete. Traditional queues can also handle complex routing really well, and this can be useful in certain situations. Traditional queues are also really good with sporadic or bursty data flow.

Kafka is designed to have consistent data moving through the system at all times, and the traditional message queue model tends to work well when we don't have that. Now the final piece that we're going to talk about here is acknowledgement. So if something goes wrong and a consumer fails to process a message, we need some way to be able to retry and send that message off to a different consumer.

So that's where acknowledgements come into play. So with the Kafka model, we don't actually have acknowledgements, we instead have offsets. So Kafka logs an offset of how many messages each consumer has received so far, and whenever a consumer needs new data, it fetches data from the queue from that offset, so it's getting all new data from the last position it read from. Once it's done processing that batch of data, it then commits its offsets to tell Kafka that it actually successfully processed that data.
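The offset model just described can be sketched with an in-memory log. This is a minimal simulation under stated assumptions (a single partition, a single consumer, a global committed offset); a real Kafka consumer would commit offsets back to the broker.

```python
# An append-only log and the consumer's last committed read position.
log = ["e0", "e1", "e2", "e3", "e4"]
committed_offset = 0

def fetch(batch_size):
    """Read the next batch starting from the last committed offset."""
    return log[committed_offset:committed_offset + batch_size]

def commit(n_processed):
    """Advance the committed offset after a batch is fully processed."""
    global committed_offset
    committed_offset += n_processed

batch = fetch(2)      # ["e0", "e1"]
commit(len(batch))    # processing succeeded, offset advances to 2
fetch(2)              # ["e2", "e3"] fetched but NOT committed...
# ...so if the consumer crashes here, the next fetch replays the same records:
print(fetch(2))  # ['e2', 'e3']
```

The key point is that nothing is ever removed from the log; "acknowledgement" is just the consumer's read position moving forward, which is also what makes replay possible.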

If a consumer disconnects before it commits its offsets, Kafka will be able to automatically send that data to another consumer, because that consumer will just pick up from the last committed offset. RabbitMQ, on the other hand, actually does have acknowledgements. So a consumer is just going to poll the queue for new data, and when it finishes processing that data, it sends that acknowledgement back to the queue to tell it that it successfully processed that record.

RabbitMQ will actually just wait for an acknowledgement, and if it doesn't receive one in a certain period of time, then it'll go and send that data out to another consumer. So these two approaches accomplish a similar goal, but the traditional RabbitMQ model tends to work better when we have long-running tasks, and we need to acknowledge those tasks as completed or failed. The Kafka model of committing offsets is great when we're processing batches of data, with a large quantity of small events coming in.
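The per-message acknowledgement model can be sketched the same way. This is a simplification: real RabbitMQ requeues unacked messages when the consumer's channel closes (or on an explicit nack), and the explicit timeout callback here is just a stand-in for that.

```python
import collections
import itertools

queue = collections.deque(["job-1", "job-2"])
unacked = {}                    # delivery tag -> in-flight message
_tags = itertools.count(1)      # monotonically increasing delivery tags

def deliver():
    """Hand the next message to a consumer and track it as in-flight."""
    tag, msg = next(_tags), queue.popleft()
    unacked[tag] = msg
    return tag, msg

def ack(tag):
    """Consumer confirms success: the message is gone for good."""
    del unacked[tag]

def on_timeout(tag):
    """No ack arrived: requeue the message for another consumer."""
    queue.append(unacked.pop(tag))

tag1, _ = deliver()
ack(tag1)             # job-1 completed successfully
tag2, _ = deliver()
on_timeout(tag2)      # consumer died mid-task: job-2 goes back on the queue
print(list(queue))    # ['job-2']
```

Unlike the offset model, the broker here tracks each message individually, which is what makes it a better fit for long-running tasks with unpredictable durations.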

A lot of systems that people use Kafka for can also tolerate messages being dropped to some extent. So to recap, let's take a look at some use cases and whether or not you would use Kafka or RabbitMQ for those use cases. So Kafka, being a stream processing system, is really good at stream data analysis. So if we have a stream of real-time data coming in and we want to do some analysis on that data as it comes in, there are a lot of tools that are able to do that using Kafka.

Kafka is also great for the event bus model, where we have events coming in from different pieces of a system, and we want to send those events out to multiple different independent systems that all want to capture that data. Kafka is also used a lot for logging, where we have a stream of logs coming into our system, and we want to capture those logs in a database, and maybe do some other processing on them in different systems. This is a good example of where we have a consistent stream of a lot of small data coming in.

And finally, Kafka is good at real-time communication, so for when we want to stream events to our users in real-time. Traditional queues are great for when we have messages that actually do need to be queued, so for example, a job worker system. So we might have a cluster of a ton of different workers that are all processing jobs, and we want to queue those messages and allow the workers to process them one at a time as they're ready.

Traditional queues are also great for decoupling microservices when we just need to have simple communication between two different services. RabbitMQ will handle all of the error handling and scaling challenges associated with that behind the scenes. If you enjoyed this video, you can find more content like this on interviewpen.com.

We have tons more in-depth system design and data structures and algorithms content for any skill level, along with a full coding environment and an AI teaching assistant. You can also join our Discord, where we're always available to answer any questions you might have. If you or a friend wants to master the fundamentals of software engineering, check us out at interviewpen.com.