Transcript for:
AWS Messaging Services Overview: SQS vs SNS vs EventBridge

Hello everyone. In this video I'm going to talk to you about the difference between SQS, SNS, and EventBridge. These are three message-processing services from AWS that look quite similar to one another at first glance. However, they do very different things and are meant to solve very different problems, so if you're confused about the differences between these three, then this is the video for you.

Now, in terms of the agenda, we're just going to be talking about a couple of things. First: what are SQS, SNS, and EventBridge? I'm going to go through them one by one and explain them through examples. Second, I'm going to tell you when you should use what: a feature comparison between the three, plus some advice on when you should use one over the other.

All right, so that's the agenda. Let's jump right into SQS and talk about what it is. SQS stands for Simple Queue Service. It's a very, very old service, one of the original services from AWS, and the reason for that is that it's very hard to build an enterprise-scale service-oriented architecture unless you have a reliable asynchronous communication service. That is exactly what SQS is. It allows application owners to publish messages to a queue and, in doing so, decouple their applications from one another.

Now, there are three main concepts that you need to know about in SQS. The first is queues, which I briefly touched on a moment ago. The second is messages, and the third is polling. Let's walk through these one by one.

Queues are the first-class citizens of SQS. A queue is what you as a user are going to create, either directly through the console, through the CLI, or using infrastructure as code. Queues are the destination that you as an application owner are going to be publishing messages to. It's kind of like a temporary holding pool where you can publish a message to, and I'm kind of
getting ahead of myself here, but where you can publish a message and have it sit in that queue and be processed at a later point in time. So that's the main idea of queues: they're just temporary holding pools. They can behave like a traditional first-in, first-out queue, but order is not guaranteed unless you specify it when you create the queue.

And then messages. Messages can be just a raw JSON blob, and they can contain any data that you want. There are some size caps, though, so you can't go too crazy with it.

Now, polling. Polling is the mechanism by which a consumer, the application that wants to receive the messages, retrieves them. Applications that want messages from the queue will poll that queue so that they can process them.

So that's a little bit of the theory. Let's run straight into an example. The example ecosystem I want to talk about is orders, as in an e-commerce application. Say we have something called order service. This is going to be the primary service that's responsible for everything about orders: storing them in its databases and notifying subscribers or third parties whenever an order gets placed, updated, shipped, delivered, and so on.

On the other side of that, we may have something like an analytics service; in this case I suppose it would be an orders analytics service. The analytics service is interested in whenever orders get created, updated, or delivered. Basically, for any change that occurs on an order, the analytics service wants to know about it.

Now, you may ask yourself: why can't the analytics service look directly at the orders database? That is certainly possible, but it is a grave violation of separation of concerns. An analytics service should never be looking directly at the data of a tier-one service that's meant to provide
business-critical functionality, so that is a big no-no from a service-oriented architecture or decoupling perspective.

All right, so we have these two different services set up. Let's assume for a second we have a person, and they're going to be invoking an order API to maybe create an order, update the order, change the item count, or something like that. When the order service gets an order API call, it needs to notify, in some way or another, that analytics service, that analytics Lambda function we have over on the right there. So how does it actually do that?

Well, in order to do that using an SQS queue, first of all we need to create a queue, and that's what I have demonstrated visually in front of you here. I called this the orders analytics queue. It's important to note that queues are typically owned by the application that is polling the queue, not necessarily by the application that is publishing messages to it. That's just a general pro tip in terms of separation of concerns.

Now, how do messages actually enter this queue, and how do they get processed? Well, the order service, in response to that order API call, will call the SendMessage API on that queue, and a message will be delivered to the queue. Perhaps a message looks a little bit something like this when the order is initially placed: we have a customer ID, we have an order state, which is currently pending, we have an array of items, and then we have the total amount for the order. At this point nothing is going to happen. This message is not going to be processed; it's just going to be sitting in the queue.

Then maybe there's another event. It may not come directly from the customer; it may just be some event that's taking place in the real world with regards to this order. So another message gets published to the queue from the order service, and that one may indicate that the order is now shipped. And then again the same thing
may occur: now we have a third message in there, and that order is now delivered. This may take place over hours, maybe even days, but the point is that unless someone is polling the queue to receive and process those messages, they're just going to sit there until the queue's message retention period runs out, which is four days by default and can be configured up to fourteen.

Now, it turns out that when you integrate Lambda functions with an SQS queue, Lambda will automatically poll your SQS queue for you. This is a nice little value-add of using SQS with Lambda functions: a lot of the complexity involved with polling, receiving, and processing messages is completely handled for you. I kind of glossed over that detail, but let's assume that this queue was just sitting here with these messages, and then finally, after some point in time, we connected that Lambda function to it. You probably should have done this prior to sending the messages to the queue, but once it's connected, the Lambda function will in turn poll the queue. By polling the queue, it's essentially asking SQS: are there any messages available in this queue that I can attempt to process?

You can do this in a one-by-one fashion, so ask for a message and get one back, or you can do what's called batch message processing, which allows you to pull a whole bunch down at once, process them all at once, and then delete them all at once. When you receive a message, that message gets effectively hidden from other threads that are also trying to receive it at the same time. So for example, with a Lambda function there may be four, five, six, maybe even more threads that are all saying "give me messages, give me messages" to the order analytics queue at the same time. That hiding behavior, which SQS calls the visibility timeout, is what keeps several pollers from picking up the same message at once; the message stays hidden until it's deleted or the timeout expires.

Okay, so now Lambda is asking for the messages that are in this queue, and your SQS queue is just going to start giving messages back. It'll give the first message back to the Lambda function, which will process it, and then
it'll move on: give the second one back, process it and move on, give the third one back.

Now, you may notice there that the order in which the messages were pulled off the queue was not the order of insertion. That is because by default SQS does not guarantee insertion order. In order to get true first-in, first-out, or FIFO, you need to configure your queue to be FIFO when you initially create it. That's something you should really be aware of: you can't change it after the fact, either from standard to FIFO or from FIFO to standard. Keep in mind that a FIFO queue does everything that a normal SQS queue does, except that there is a limit on the rate at which you can publish to the queue. I believe it's a few hundred transactions per second, but for most of you that probably won't matter.

So this is the general idea of SQS. It's essentially a temporary holding area for messages that can be processed later. It gives your pollers, the consumers of the messages, a lot of freedom in terms of how fast they want to process those messages. And typically it's one-to-one: you have one system over here that's publishing messages and only one system over there that is receiving and processing them. That's a common pattern you'll see with SQS.

All right, so let's remove that and move on to SNS. SNS stands for Simple Notification Service, and again it's a very, very old service from AWS, very close in age to SQS, but it does something quite different even though the two have very similar names. The main concepts that you need to know about are topics, messages, and publish/subscribe, sometimes called pub/sub.

Topics are the first-class citizens of SNS, similar to queues, except that topics aren't holding pools for messages. They're essentially created with a particular theme in mind. Think maybe an orders topic or
transactions topic or login topic. Basically, anytime an event occurs in an application, the application that owns that data will publish a message to the topic, and the topic will deliver an identical copy of that message to all of its subscribers. So if we have an orders topic and many services are interested in receiving updates on orders, we subscribe all of those services to that topic so they can be notified whenever an event occurs and process that event at their own pace.

And of course there are messages. Messages are pretty much the same as in SQS: they have a size limit as well, and they can just be your vanilla JSON blobs with whatever arbitrary data you want in there.

Third, there's that notion of publish/subscribe, a.k.a. pub/sub. Publishing happens on the side of the application that owns the data, and subscriptions belong to the receivers of the data, so whoever wants to know about an event. Those are the critical concepts at play in SNS.

So let's do the exact same exercise that we did with SQS, and we'll just modify it a bit to fit the world of SNS. Again we have this notion of order service, and this time we're spicing it up a little bit: we have multiple services on the other end. We have accounting service, we have analytics service, and we have order dashboard service.

Before we go into what SNS does: you could try to solve this same problem using SQS, right? Let's talk about what would happen if you created just one queue here and pointed all three services at it. Your order service would publish a message to that queue, then all three of the other services would try to poll that queue, and one of them would win. One of those services will get that message first, claim it, and process it. So that essentially means that only one of the three
services that want to know about order events is going to be notified of any given event. Not very good.

Alternatively, you can have three separate queues: one for accounting service, one for analytics service, one for order dashboard service. Then you can wire it up so that order service publishes to each of these three queues, and these guys can just poll their own queues and go on their merry way. However, that introduces a new problem: now order service needs to publish to three different queues every time an order event takes place. That's not very scalable, and it introduces some very interesting partial-failure scenarios. For example, what if you publish to the first one, publish to the second one, and then fail on the third one? Now order dashboard service is out of luck. Not very good architecture design.

So in order to solve this we need to leverage SNS, and this is really where it shines: the fan-out functionality of SNS. Same kind of world, we have that order API, but instead of sending messages to a queue we are going to publish messages to our topic, with the same payload as before. In this case it's going to be the pending one, and this is where our topic comes into play. It looks very similar to a queue, but it's not a holding pool, so messages are kind of ephemeral: they enter the topic and exit the topic very, very quickly.

What will happen is that when a message gets published, like this one with the pending state that I have with the black background there, it goes to the topic, and then SNS will automatically deliver it, as long as you have the subscriptions set up. Here you need a subscription to the orders topic for the accounting service, one for the analytics service, and one for the order dashboard service. So a message gets published, that message is temporarily inside the topic, and
then SNS will deliver an identical copy of that message to all three services. After it does that, the message disappears from the topic and everything just moves on.

Now the second state update comes in: the order is shipped. Again we publish a message to the topic, and SNS delivers an identical copy to all three recipients. Same story again. And then of course we have the delivered event, and the same thing happens: the message gets there and is again delivered to all the respective services.

I do want to point out, though, that generally it's not a good idea to have SNS deliver directly to an endpoint, such as an HTTP endpoint for a service, or even directly to a Lambda function. The reason is this: imagine a scenario where accounting service has a bad deployment and the service goes down. What happens to all the messages that are being published to the orders topic while accounting service is down? Well, it's down, right? It's not going to get those messages. You're going to have to rely on the retry mechanism that's built into SNS to keep retrying delivery of those messages to accounting service. If the service has some kind of prolonged outage, for many hours or even days or who knows how long, it may lose some data, which in a lot of cases is unacceptable for tier-one services.

So how people typically use SNS in this kind of setup is that, instead of delivering directly to the services, they put a queue in front of each service. The queues are owned by the receiving services: accounting service will have its own queue, the Lambda analytics service will have its own, and the order dashboard service will also have its own. This way the messages are delivered from the topic to the queues. And in the case where the accounting service goes down, even though it's not processing messages, all of those messages are still going to
be delivered to the queue, so as soon as it comes back up it can catch right up and resume where it left off.

So this is the main idea of SNS: it is fan-out at enormous scale. You can have many, many different subscribers, millions of subscribers, and you can have very high throughput to the topic, with very loose limits on how fast you can publish. It's very widely used in serverless architectures to decouple applications from one another. Because if you think about it, order service now doesn't need to know that, behind the scenes, accounting service, analytics service, and order dashboard service all have a dependency on it. It doesn't know about all the stuff happening downstream, and it doesn't need to. Order service is what is called a higher-order service, and a higher-order service will never need to know about an analytics service unless you have some very bizarre use case. So you get this very interesting mode of decoupling where these higher-order services don't have to care about what happens downstream of them. That's really where the benefits of SNS shine.

All right, so that's it for SNS. Let's talk about EventBridge now. EventBridge is kind of the new kid on the block. It was released, I believe, just a couple of years ago. It's very similar to SNS and has some improvements on it, but it does things a little bit differently. There are four main concepts in EventBridge: the first is the event bus, which you'll also hear me call a message bus, the second is events, the third is rules, and the last is targets.

Event buses are basically the same idea as a topic: you publish messages to the event bus, and you have different recipients of those messages.

Second, we have events. Events are what we were just looking at in the example, like a pending, shipped, or delivered event. Events can be constructed by an application such as order service; they can also be
emitted by an AWS service itself, maybe something like EC2. For example, whenever an instance gets spun up, you can integrate EC2 with your event bus so that events automatically get generated whenever these things occur. And thirdly, you can also integrate it with third-party SaaS, or software-as-a-service, providers, things like Datadog, PagerDuty, and Shopify. There's a whole bunch of them that have direct integrations with EventBridge, so instead of you having to write your own custom code, they provide that functionality for you.

Third, we have rules. Rules basically just match incoming events and send them to their corresponding targets for processing. And then targets are just your destination endpoints: they are what is going to be invoked at the end of the day, or the subscribers, in SNS lingo.

EventBridge has some other interesting features as well, such as message filtering. If you only care about a subset of messages, say only delivered events from our previous example, you can have a target that will only receive delivered events. In fact, you have very similar functionality in the world of SNS, where it's called subscription filter policies, so that functionality isn't unique to EventBridge. But just keep in mind that it is a feature of EventBridge.

Now, in terms of the architecture diagram, it turns out that it's pretty much exactly the same as what we just saw for SNS, so I didn't want to insult your intelligence by walking through it step by step. The only difference is that we have the orders event bus instead of the SNS topic we had before. So at first glance these two look pretty similar, right? If you think SNS and EventBridge seem to do the exact same thing, you would be mostly
right. The main difference, I would say, is that with EventBridge the big, shiny thing that people are attracted to is that third-party and AWS service integration that you get for free. Integrating with things like Shopify, PagerDuty, and Datadog automatically, without having to write custom code, is super handy. But essentially it uses the same principles. You have an event bus, which is basically a container for your events. You publish events to your event bus. You have rules that specify which targets receive which events. And at the end you have targets, which are the actual recipients of those events. This can be configured in a one-to-many way or a many-to-many way, so there are a lot of interesting things that you can do with event buses.

But it is a newer service, and there are some limits you should be aware of that make it a non-starter for me. The biggest problem with using EventBridge is the fact that, for a specific rule, you can have a maximum of five targets. I'll say that again: for a specific rule, you can have a maximum of five targets. That means that if you have consumers that are interested in just delivered events, you can only have five of them on that rule. This isn't good. It's not going to work in real life, because a lot of large enterprise-grade applications have many, many subscribers, and you never want to limit yourself to any small number. Five is just inconceivably low, in my opinion, and it makes EventBridge a non-starter. If anyone is listening from EventBridge, please make this better; five just is not practical.

So that's what EventBridge is: very similar to SNS, with very similar concepts. I wouldn't be surprised if it's even built on top of SNS, to be perfectly honest with you. All right, so that's what these three services are.
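
To make the rules-and-targets idea from the EventBridge section a bit more concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the AWS API: the `EventBus` class, its method names, the `orderState` and `customerId` keys, and the simplified pattern format (each key maps to a list of allowed values) are all assumptions made for illustration. The only thing taken from the discussion above is the behavior: rules match incoming events, matching events go to that rule's targets, and a rule accepts at most five targets.

```python
MAX_TARGETS_PER_RULE = 5  # the per-rule cap discussed above


class EventBus:
    """Toy stand-in for an event bus (hypothetical, not the AWS SDK)."""

    def __init__(self, name):
        self.name = name
        self._rules = {}  # rule name -> (pattern, list of target callables)

    def put_rule(self, rule_name, pattern):
        # pattern: {"key": [allowed values, ...]}; an event matches when
        # every key in the pattern has one of the allowed values.
        self._rules[rule_name] = (pattern, [])

    def put_target(self, rule_name, target):
        _pattern, targets = self._rules[rule_name]
        if len(targets) >= MAX_TARGETS_PER_RULE:
            raise ValueError(f"rule {rule_name!r} already has {MAX_TARGETS_PER_RULE} targets")
        targets.append(target)

    def put_event(self, event):
        """Send a copy of the event to every target of every matching rule."""
        delivered = 0
        for pattern, targets in self._rules.values():
            if all(event.get(key) in allowed for key, allowed in pattern.items()):
                for target in targets:
                    target(dict(event))
                    delivered += 1
        return delivered


bus = EventBus("orders")
bus.put_rule("delivered-only", {"orderState": ["DELIVERED"]})

received = []  # stands in for a service that only cares about deliveries
bus.put_target("delivered-only", received.append)

for state in ("PENDING", "SHIPPED", "DELIVERED"):
    bus.put_event({"customerId": "c-1", "orderState": state})

# Fill the rule up to five targets, then try to add a sixth.
for _ in range(4):
    bus.put_target("delivered-only", lambda event: None)
try:
    bus.put_target("delivered-only", lambda event: None)
    sixth_target_rejected = False
except ValueError:
    sixth_target_rejected = True
```

In this sketch only the delivered event ends up in `received`, and the sixth target is refused, which is the limitation being complained about above.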
Let's just go to the very quick when-to-use-what summary section, briefly highlighting SQS, SNS, and EventBridge.

SQS is great for reliable one-to-one asynchronous communication between microservices. It's very durable: you can temporarily hold those messages in a holding pool, and that basically gives you back pressure, so that the consumers polling the SQS queue don't have to burn themselves out trying to process messages too fast. They can do so at their own pace. That's a really highly overlooked benefit of using SQS. It does support ordered message processing if you wish, but just remember that you need to set that up when you initially create your queue; you can't modify it after the fact.

Now, in terms of SNS, it's great for one-to-many fan-out, in the sense that you publish a message once and a copy of that message is fanned out to many different subscribers. It's great for very high-throughput applications, and it's also great for applications that have many, many subscribers that can potentially be interested in a single event.

And then EventBridge is one-to-many with limitations. Like we discussed before, that five-targets-per-rule limitation is a bit of an issue. It can also be configured to be many-to-many if you want to get creative with it, though I don't suggest that. The main reason I think people should use EventBridge is for those integrations, whether with SaaS providers or with custom applications, like we were doing in our example with order service.

So I hope you enjoyed this video. If you want to learn more about application orchestration on AWS, check out the playlist on the right, and please don't forget to like and subscribe. I'll see you next time.
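
As a companion to the summary above, here is a small Python sketch of the queue-versus-topic difference. Again, this is an in-memory toy and not the AWS SDK: the `Queue` and `Topic` classes, their method names, and the `orderState` key are hypothetical. It also simplifies real SQS behavior, since receiving here removes the message right away instead of hiding it until it is deleted, and the toy queue happens to be strictly first-in, first-out. What it does show is the competing-consumers problem with one shared queue, and the topic-with-a-queue-per-service pattern described in the SNS section.

```python
from collections import deque


class Queue:
    """Toy holding pool: messages wait here until a consumer polls."""

    def __init__(self, name):
        self.name = name
        self._messages = deque()

    def send_message(self, body):
        self._messages.append(body)

    def receive_messages(self, max_messages=10):
        """Return (and remove) up to max_messages waiting messages."""
        batch = []
        while self._messages and len(batch) < max_messages:
            batch.append(self._messages.popleft())
        return batch


class Topic:
    """Toy topic: stores nothing, copies each message to every subscribed queue."""

    def __init__(self, name):
        self.name = name
        self._subscriptions = []

    def subscribe(self, queue):
        self._subscriptions.append(queue)

    def publish(self, body):
        for queue in self._subscriptions:
            queue.send_message(dict(body))  # each subscriber gets its own copy


states = ("PENDING", "SHIPPED", "DELIVERED")

# One shared queue, two consumers: each message reaches only one of them.
shared = Queue("shared-orders")
for state in states:
    shared.send_message({"orderState": state})
accounting_got = shared.receive_messages(max_messages=2)
analytics_got = shared.receive_messages(max_messages=2)

# One topic, one queue per service: every service sees every message.
orders_topic = Topic("orders")
accounting_queue = Queue("accounting")
analytics_queue = Queue("analytics")
orders_topic.subscribe(accounting_queue)
orders_topic.subscribe(analytics_queue)
for state in states:
    orders_topic.publish({"orderState": state})  # the publisher publishes once
accounting_batch = accounting_queue.receive_messages()
analytics_batch = analytics_queue.receive_messages()
```

With the shared queue the three messages are split between the two consumers, while with the topic each service's own queue ends up holding all three, and those queues keep holding them even if a service is not polling for a while.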