in this video we're going to talk about the cap theorem my name is Kevin way and I'm here to help you land your dream job in Tech the cap theorem is an important database concept to understand in system design interviews cap stands for consistency availability and partition tolerance hence the name cap theorem consistency refers to all the users being able to see the same data at the same time availability means that the system is always available for reads or rights even with node failures and partition tolerance means that the system continues to function even if communication fails between nodes the theorem states only two of these three properties can be ensured in a distributed system before we dive deeper into the cap theorem if you're enjoying these Tech interview prep videos make sure you like subscribe and hit the notification Bell for new videos every single week it might feel meaningless but doing these things really help the algorithm find our content in system design an assumption we make is that Network partitions will happen so to offer any kind of reliable service partition tolerance is necessary so you can't choose to Forfeit the p and cap that leaves us to make a trade-off between ensuring consistency or availability let's first talk about consistency consistency is the property that after a write is sent to a database all read requests sent to any node should return that updated data imagine a scenario where there's a network partition in this case both node a and node B would reject any write request sent to them this would ensure that the state of the data of the two nodes are the same otherwise only node a would have the updated data and node B would have stale data sometimes you'll hear the term strong consistency the consistency in cap theorem does not necessarily refer to strong consistency in a strongly consistent database if data is written and then immediately read after in theory it should always return the updated data however in a distributed system network communication doesn't happen instantly since nodes are physically separated from one another and transferring data requires some amount of time this is why it's not possible to have a perfect strongly consistent distributed database in the real world when we talk about databases that prioritize consistency we usually refer to databases that are eventually consistent with a very short unnoticeable lag time between nodes so to summarize again consistency in the cap theorem means that all users would be able to see the same data at the same time now let's talk about availability in a database that prioritizes availability is acceptable to have inconsistent data across nodes where one node may contain stale data and another has the more recent data availability means that we prioritize nodes to successfully complete requests sent to them available databases also tend to have eventual consistency which means that after some amount of time with a network partition is resolved all nodes will eventually sync with one another to have the same updated data in this case node a will receive the update first and after some time node B will be updated as well now let's think when should consistency or availability be prioritized if you're working with data that needs to be up to date then you'd want to prioritize consistency on the other hand if it's okay that the query data can be slightly out of date then storing it in an available database may be a better choice let's check out some examples for a better understanding imagine you're building an app like Amazon where Shoppers can browse a catalog of products and purchase them if they're in stock you want to make sure that products are actually in stock or you'd have to refund the Shoppers for unavailable items should your database that stores product information prioritize consistency or availability here you should choose consistency in the case of a network partition where nodes can't sync with one another you'd rather not allow any Shoppers to purchase any products than have two or more Shoppers purchase the same product when there's only one item available an available database would allow for the latter and then at least one of the Shoppers would have to have their order canceled and refunded let's imagine another scenario where the same amazon-esque product let's say a PM decides that it's much more cost effective to refund Shoppers for unavailable items rather than to show them as out of stock during a network failure in this case you'd want to prioritize availability since canceling and refunding orders will be preferable to not allowing any Shoppers to purchase a product at all during a network failure now notice that only write requests were discussed here this is because read requests don't affect the state of the data and don't require re-syncing between nodes read requests are typically binary Network partitions for both consistent and available databases I hope this helped you understand the cap theorem and for more Tech interview prep videos exponent has the best resources to help you Ace or interview including in-depth interview courses private coaching and a community of experts ready to help you prep for even the toughest questions hit the Subscribe button for new videos every single week and go to try exponent.com to become a member today thanks for watching foreign [Music]