Stellar Consensus Protocol Overview

okay hello everybody welcome to this Tech Talk on the stellar consensus protocol my name is Courvoisier Austin I'm a director of engineering here at Google I want you to welcome our speaker today David masse Aires David is a professor of computer science at Stanford University and the chief science scientist at the Stella foundation David also wears a lot of cool t-shirts and so give a warm welcome to David thank you thank you so but before we get started Stanford requires me to tell you that this work is paid consulting it wasn't part of my regular Stanford University duties and responsibilities okay so if you think about banking the core job of a bank is really keeping track of customer assets and liabilities right if you walk into a Wells Fargo branch here in California and you deposit a bunch of money they're gonna put your bills in the same drawer as everybody else's bills and so they have to keep track of the fact that you know they'll use so much money and you owe them so much money and so on and typically the way they would do this is by replicating a ledger right so they've got the ledger is the most important thing the bank has so they might keep you know at least three copies of this thing around so even if there's a data center that experiences some big disaster they'll still have some know how much money is in your account but this begs a question what about interbank transfers so suppose now that somebody over here in China wants to send me a hundred and fifty dollars in California right well now the bills aren't aren't sitting in the same drawer if I go not even the same currency right and so what happens is this Chinese bank needs and the bank in California need to find some third bank a correspondent bank that maybe the Chinese bank has an account with so that they can adjust the ledger at that Bank and then that bank can do whatever it needs to do to send the money to Wells Fargo this is all very complicated and the net result is that it adds a lot of delay and it adds a lot of cost and so you end up seeing headlines like this where people end up paying sometimes huge Commission's up to like 22% just to send money around between countries and a lot of the fees that are paid to send money around are actually shouldered by the poorest people so it kind of make brings this kind of ironic comment that it turns out it's expensive to be poor right because these poor people are paying more in banking fees it's not all bad news of course if you're well if you're Western Union you managed to make five point six billion dollars moving just 85 billion dollars around in 2014 but of course those those fees are coming out of the pockets of the people who can least afford it now of course it's actually something worse than being than having to pay excessive fees to Western Union and that is you might not be able to send money around at all right so imagine that you want to send money home to your family and Somalia and it just turns out there's no way to do that and in fact that happens so you see headlines like this where people are literally unable to send money back to to family members who need it okay so what's going on here well if you think about today's financial networks they actually look a lot like email did before the internet so before we had internet mail we have things like uucp and bitnet and people used to send email addresses used to look like this there were these things called Bank paths where you had you know host one bank host two bank host three bank user and these addresses they're not globally meaningful and they had a lot of delay because the email had to go through a bunch of hops with store-and-forward and nowadays it's pretty easy it doesn't matter where you have an email account if it's you know Gmail or yahoo mail or whatever you can just send email to anybody else so wouldn't it be nice if we could send money around the internet in the same way and that's really been the that's the core focus of what we've been working on at the stellar Development Foundation and the the first observation here is that it actually would be possible to move money around like this if we had some kind of global ledger keeping track of you know who had what money right because now instead of having to find some course and bank in order to send the money from China at to California you know we could just the two banks could just make the adjustment in the global ledger that one hundred and fifty dollars had moved here and then the money could just pop out in my account here be much easier the problem of course is if you're gonna have this kind of global ledger there's no single party that would be trusted enough to manage this ledger that keeps track of you know how much money all of the banks have in this international financial Network and even kind of trying to split the cost split the trust across some kind of consortium is problematic because you know who gets more seats the u.s. or China cuz you know China has a bigger population and it's like Cuba has a seat at the table can US banks participate you can imagine all these thorny questions that would come up so so this is problematic on the other hand we do have this existence proof of a global decentralized system that works really well without any kind of centralized control and that is inter-domain routing on the internet right the Internet is this giant global network works pretty well but it's built out of these pairwise peering and translate relationships between pairs of ISPs so so what I'm going to talk about today is how we can use the same kind of pairwise trust to achieve this secure global consensus and once we have this kind of Internet's level consensus we'll use that to realize the global ledger okay now why consensus well it turns out consensus is the key to replication you know the most common way to build a replicated system is to use what's called a replicated state machine approach where every replica starts in the same initial state and then you agree on the exact sequence of operations that you're going to apply and therefore all the replicas maintain the same state right and this second one right making sure that all the replicas agree on the exact sequence of operations turns out to be the hard problem right so in this case you know maybe we want to agree that operation 7 here is this transfer of 150 dollars into my account in California and so essentially the whole talk is going to be concentrated on this question of how does the whole world agree that you know yes this 150 dollar transfer actually happened and was part of operation 7 ok so here's kind of an outline what I'll talk first I'm gonna give some back and cover Byzantine agreement but I'm going to kind of present it there's a novel twist I'm going to present it through this new lens which is viewing everything in terms of voting on what I'll call irrefutable and neutralise double statements and we'll see what that means and then I'll discuss federated Byzantine agreement which is a generalization of Byzantine agreement to a model that could actually accommodate this kind of internet level consensus then I'll talk about failure resistance and you know how well we can do in terms of tolerating failures and then I'll present my construction which is the stellar consensus protocol okay so let's get into the consensus problem the goal of the consensus problem is for multiple replicas to agree on an output value and so you have a bunch of agents in the system each of which is storing a replicas each agent starts with an input value and the input value would typically be the candidate for you know the enth operation and some replicated state machine log and then the agents are gonna exchange a bunch of messages and use those messages to agree on one of their inputs and then they're gonna output the value in this case you know v2s input was nine maybe the three agents pick nine and so then the output nine and the important point here is that the output is right once you know once you output the value nine it could cause you know ATM machines to spit out twenty dollar bills or something irreversible actions may happen so you can never change your mind you better be sure everybody agrees on a value before you output it okay so there's some properties you might want of your consensus protocol so one is that you'd like all the outputs produced to have the same value right otherwise it's not much of a consensus protocol so that's called agreement you'd also like the output value to actually equal one of the agents input values right otherwise there's a trivial solution everybody just outputs zero and it's not a very interesting protocol so together this agreement and validity which is what you call when the output actually equals one of the inputs I'll call this safety right those two properties constitute safety you might also want your protocol to guarantee that eventually all the non faulty nodes will actually output a value so that's called liveness and finally you might want your consensus protocol to provide fault tolerance meaning that you can ever from the failure of an agent at any point in the execution of a protocol and when it comes to fault tolerance we actually care about two different kinds of of both tolerance in a fail stop model you assume that agents fail by crashing and then they just never send any more messages and in the Byzantine fault tolerant model you assume that agents can fail by behaving arbitrarily so you can kind of imagine that attacker takes over an agent and sends kind of the worst possible messages with the goal of trying to mess everybody up now there's this seminal result in distributed system due to Fisher Lynch and Patterson called the FLP impossibility result which states that no deterministic consensus protocol can guarantee all three of safety liveness and fault tolerance in an asynchronous system so that's kind of a bummer since since that's what we wanted but let's try to build up an intuition for why this is so if you remember a couple of slides ago we had these three agents and they chose v2 input which was nine and they helped with the value nine right now suppose we you know we turn back the clock and we restart this protocol but right is the agents start to execute there's some kind of transient Network outage and v2 is cut off from the rest of the network so if you want to be three can talk but not v2 so at this point in an asynchronous system v1 and v3 may start to think like Oh looks like v2 may have failed right and so if the protocol is fault tolerant then v1 to v3 need to be able to complete the protocol and terminate without talking to v2 and since they don't know what v2 is input is they're gonna have to pick another input so either three or seven let's say that in this case they they pick seven and then the output seven right and at some point of course the the network may come back up and then v2 can talk to the other nodes and if the protocol is safe and provides agreement then v2 also better output seven so that it can match with the other nodes output so what you can see here is that a the output value chosen by the nodes actually depended on the network behavior you know depending on whose messages got delivered first they could pick the value nine or they could pick the value seven so when a consensus protocol is in this kind of state where the network can affect which of several values can be output will say that the protocol is in a bivalent state right you would maybe think multivalent but I'm using bivalent to be consistent with the literature here okay so conversely you could reach a point where there's only one output value possible right you know the fate of the systems been sealed seven is the only possible output value and in that case we'll say that the system is uni valent or if the only output value is seven we'll say the system is seven valence same uh in addition you kind of hope this won't happen but if you don't have a very good consensus protocol you could also end up in a situation where you get stuck and there one or more faulty nodes that can never output a value no matter what the even if the network decides to start cooperating so in that case we'll say that the protocol is stuck so if you remember from a few slides ago we said the output is right once right you can never change your mind and of course all the output values need to agree and so what that means is that it's never possible to output a value if you're in a bivalent state right because once a particular value of an output well then that should seal the state of the system any other up outputted values should should be the same so that means that if you if you have one of these systems and you have an execution that starts in a bivalent state then it eventually terminates then it must at some point have reached a uni valence state okay so now here's an intuition behind this this FLP impossibility result let's imagine that you have a terminating execution of a bivalent system right you start by Valentin and eventually everybody he'll put some value and let's let M be the last message that was received by any node in a bivalent state so receiving M is kind of what flipped the system from a bivalent to a univille in state so in that case we'll say that M is the is we'll call em the deciding message right because somehow delivering amaz with the side of the fate of the system okay now let's kind of turn back the clock and say that instead of delivering M the network had delayed m and a bunch of other messages have gotten delivered first and those messages of course might have altered the state of the system and they might have altered the state of the system in such a way that when M is eventually delivered much later than it was originally it's no longer a deciding message right and just delivering M is enough to flip over to a uni villian state so in that case we'll say that the message M has been neutralized right neutralizing a message means taking it making taking what would have been a deciding message and now making it no longer interesting or no longer able to decide the fate of the system so here's the the overview of the FLP proof right you can show that there bivalent starting configurations and then if your system is fault tolerance it turns out that the network can can neutralize any deciding message and the reason is that you know whatever nodes sent that deciding message might have failed so you can't always count on being able to receive that message so there has to be some way to kind of recover and say okay let's stop waiting for that message and do something else so that we can make some progress and pick a value and so taking together what these two things mean is that the system can essentially remain bivalent in perpetuity if every time there's about to be a deciding message the the network pathologically delay is just that one message until it's been neutralized and so you'll never end up transitioning to a univille at state okay so how do we cope given that we want to achieve consensus and a lot of systems depend on it well what you want to do is design protocols that terminate in practice even if they're not guaranteed to terminate and the most important property that you need to achieve for this is that you have to avoid stuck States right you want to make sure that whatever weird stuff is happening on the network if eventually you fix a network and things start behaving in a more friendly way that you can recover and everybody can agree on our value and output that value okay good so now let's look at a strawman attempt to to implement consensus right so let's say that you have a system with n nodes and for now to keep things simple we'll assume fail stop behavior so they might crash but they won't act arbitrarily and now we're going to pick some quorum size T where T is greater than n over two so it C includes a maturity of nodes and and we say that if if T nodes namely in other words a quorum ever all vote for the same value then it becomes safe to output that value in this example up here you can notice that that quorum a here has all voted for the value nine so at that point it becomes safe to help with the value 9 right and this guarantees agreement because the quorum size includes a majority of nodes so any two quorums will intersect in at least one node and nodes aren't allowed to change their votes so there can be at most one value that will be output now what's the problem well the reason this isn't a very good consensus protocol is because you can have stuck States right so in particular imagine that one of these nodes that was voting 9 actually failed now you no longer have T nodes voting for the value 9 some people might have heard all T votes but as some people might not have heard the vote of this node that failed because it won't be around to retransmit it's it's it's vote and so there will be nodes that that cannot actually output a value another problem you could have is you could have counter a split vote right so everybody votes for a different value and you can't change your vote then you'll never be able to output any value cuz no value will ever actually have a quorum voting for it make sense ok so this is what voting actually gives us write it with this if we try to use voting as a consensus protocol will find that we start in some bivalent state and then if we get enough votes we might end up in an a valence state for some value a that we're voting for or we can end up in some a bar valence date where a bar will let that send for any value that's not equal to a and contradicts a but if enough people vote then maybe the whole system would end up agreeing on a otherwise you can end up in this stuck state over on the right so so voting might give us what we wanted but it might put us in the stuck state that we can never recover from so what this means is you can never directly vote on the actual consensus question you care about right never take an up or down vote on you know is operation 7 a credit of 150 dollars to this bank in in California to my account right you can't vote on that because if that gets stuck you'll never be able to figure out how much money is in my account right though it won't be no more well-defined notion of so what kind of statements is it safe to vote on well first of all it would be okay to vote on a statement if that statement never gets stuck because that means you could eventually reach agreement and second of all it's okay to vote on a statement if you can somehow break the hold that that statement has on the actual consensus question that we care about right so if a statement gets stuck but that doesn't prevent me from figuring out how much money is in my account well you know then that's okay so let's look at this in a bit more detail right so first observation is that the reason that you end up in these stuck States is that nodes aren't allowed to change their vote right if everybody votes for a different value and they can't change their vote then they can't turn around and all vote for the same value or at least have a quorum vote for the same value but suppose you have a statement where for some reason you can reason about the fact that no correct node would ever vote against it or vote for anything that contradicts that statement well then you won't get stuck and so I'm going to call that kind of statement an irrefutable statement right so air/fuel statements are fine to vote on because you know nobody's prevented from voting on it by past votes so eventually you know as long as there's a quorum that quorum can can vote for the irrefutable statement the second kind of statements are statements whose hold on on the consensus question can actually be broken and if you recall from we were talking about the the FLP proof that we said that for a fault-tolerant system you need to be able to neutralize any deciding message and so we'll say that a statement is neutralized well if you can kind of transition the system to state where you no longer care about the the value of that that's statement so it turns out that most of the kind of trickiness involved in designing a consensus protocol is in the question of how do you formulate statements that are useful towards consensus but can also be neutralized if you need to okay so there are two popular ways of designing these neutralize double statements one is the ballot based approach and the other is what I'll call the view based approach the view based approach may actually be simpler you should probably use it if you can get away with it but but in this particular context I couldn't figure out how to make it work so we'll talk about ballot based neutralization so ballot based neutralization as something that was introduced by Leslie Lamport for paxos and the idea is that you vote to commit or abort ballots where the two are contradictory you can only vote one way and each ballot is a pair n comma X where n is just a counter they can keep incrementing so you have an unlimited supply of ballots and X is a candidate output value for the consensus protocol and we'll say that yeah and if I if you actually have a quorum that votes to commit and comma X then we'll say that it's safe to output the value X now in order to make this work we're gonna have to preserve an invariant and the invariant is that all committed and all stuck ballots are gonna have to have the same value X so so how are we going to preserve this invariant well the way we preserve this invariant saying you can't just vote to commit a ballot arbitrarily before you vote to commit a valid you have to prepare that ballot and what does it mean to prepare the ballot well you prepare a ballot and comment X by aborting every ballot with a lesser or equal counter and Prime that has a different value not equal to X right and because of the way the because the way it's been designed even if some ballot income X gets stuck there's nothing that prevents you from preparing n plus 1 comma X so that means you can kind of neutralize a stuck ballot and move on to the next ballot so let's maybe illustrate this with an example so here let's say we have 8 candidate values a through H and initially all the ballots in the system are are bivalent and then let's say the system kind of agrees that they've prepared one Kommetjie others they've aborted all the other ballots at that point you might vote to commit 1 comma G but you could lose that vote it could turn out that 1 comma G was aborted after all and so then you'll move on to about ballot round 2 and now maybe you try to prepare 2 comma F right so let's say you managed to prepare 2 comma F and now you vote to commit it and it might be that well you know what that vote gets stuck turns out we'll never managed to commit to come 1/2 you know oh well just bump the counter again and now let's move to 3 comma F right and now you'll prepare three comma F and vote to commit three comm F and maybe this time the vote succeeds so you've managed to commit ballot three comma F and now we're done we've chosen the value of the system we can output the value F and notice that at this point nobody cares about the status of 2 comma F anymore right that's been totally that's been that's been neutralized right whether 2 comma F is aborted or committed or we'll never know it just doesn't matter because we the question we actually care about which is should we output the value F that one has been decided and if for some reason not everybody can can see the the quorum that voted for 3 comma F because maybe some nodes failed or something that's fine just bump the counter again and now we can you know we can prepare and commit for com F so this is this is paxos this may be an unusual way to present taxes but we'll see it as a way that generalizes to the model that we need for this kind of global financial network okay so so far we've been in the fail stuff case we've been imagining that the the nodes that failed they just kind of crashed and they never output another message what happens when you have Byzantine failure right so the big difference here is that when you have a business in failure the failed node it can actually change its vote right it can give inconsistent messages to different people it can tell one person who I vote for this value and somebody else would I vote for that value so in the fail stop case in order to guarantee safety and in particular in order to guarantee agreement we needed this guarantee that any two quorums shared a node and we said well that node won't change its vote so you know therefore any two quorums that that vote for a value will vote for the same value now we need to kind of strengthen strengthen that assumption and we need to say that any two quorums need to share an on faulty node right because the faulty node can just lie and and give contradictory votes so if you have n nodes and you're quorum sizes T then the minimum overlap between any two quorums is going to be 2t minus n so what that means if you want an honest node in there then the greatest number of Byzantine failures you can tolerate is going to be 2t minus n minus 1 and I'm gonna call that value F sub s because that's the the failure tolerance with respect to safety basically now what about liveness well for liveness you actually want to make progress you actually need there to be a quorum right and so if you have n nodes in a quorum size of T that means you can tolerate up to n minus T failures and still have a quorum left over with which to make progress so I'll call that value F sub L because it's the failure tolerance for liveness right and there exists long standing solutions to the Byzantine agreement product protocol in this problem in this model and typically what people do is they set the number of nodes and to be 3 F plus 1 for some some integer F and the quorum size T to be 2f plus 1 and the reason they do that is that it gives you the kind of equilibrium point where FS equals FL which equals F in other words this kind of failure tolerance for both safety and liveness are the same but for the purposes of this talking is gonna be important to keep those those numbers different so I'll just keep referring to them as FS and FL so you know which one I mean when I talk about this okay so here again is the picture of what we get from a vote and so a question that you kind of need to answer at this point is how do we know when we're done right so if you're in the system and you see T votes for some value a then you know the systems at least a valence right you know the system's not gonna agree on some different value but you don't know that all the non faulty nodes will eventually agree on this value a or even know that the system reached an a valid state so what do we do about this well if you remember from from the previous slide we said that you know F sub s was the greatest number of nodes we could survive failing and and still be able to guarantee safety so that means if F S plus 1 nodes actually fail you've lost all safety guarantees right because now the intersection of two quorums could be like entirely malicious so suppose you have FS plus 1 nodes that they're all say the same thing they all say like we saw a quorum vote for the value a well at this point you can kind of say well either either they're all lying to me in which case they're all malicious so I've lost all safety anyway or one of them is telling the truth so I might as well assume that a quorum is voted for a without actually losing safety now suppose you hear actually FL plus FS plus 1 nodes all say the same thing they all say we saw quorum vote for a well it could be that after saying this you know FL are more than FL of those nodes died in which case the system you know we can't guarantee liveness anyway but if FL are fewer of these nodes have failed then that means that there will continue to be FS plus one of them around and those FS plus one if they're not faulty they'll continue to repeat the fact that they saw a quorum vote for a and so that'll convince everybody that a quorum voted for a so at that point all the non faulty nodes will come to accept the fact that the system reached an a valid state and so they'll know that none of them have have gotten stuck and so at that point we can say that the system has actually agreed on the statement a ok all right so so this is all good but so far we've been in this kind of tea event system where you had to know what the end nodes were to begin with so now I'm gonna talk about this new model called federated Byzantine agreement which is kind of a generalization of standard Byzantine agreement to a setting where you don't have universal agreement on who all the nodes are or how trustworthy all the nodes are and the key idea in Federated Byzantine Agreement is that we're going to pick our quorums in a decentralized way so each node in the system V is gonna pick one or more of what I call quorum slices so quorum slices are sets of nodes that that include V itself and a warm slice is a set of nodes that V believes is kind of important enough that if the whole Quorum slice agrees on something then you know that thing is is true and it's not going to be rolled back and and you can kind of think of deciding to put somebody in your quorum slice in one of your quorum slices is vaguely analogous to you know in in the intradomain routing setting of two ISPs deciding to peer decide that they you know there's some value in in in exchanging internet traffic or in this case that they some value in waiting for somebody to agree to someone before to something before considering that statement settled okay so a federated Byzantine agreement system is basically a set of nodes V and a quorum function Q where Q of V is the set of slices that were chosen by node V and so now we can define a quorum we can say the quorum U is just a set of nodes that contains at least one slice belonging to each one of its members right and this this is kind of like the the key definition in the whole talk right so I want to zoom in on this with a couple of examples to make this much more concrete so let's start with a simple example just a system with four nodes right and you have the corner each each node also to make this even simpler has only a single quorum slice and I'm representing the quorum slices by drawing arrows from a node to all the other nodes in in its quorum slice right so in this case you can see V one's quorum slices b1 b2 b3 and the other three nodes b2 b3 for our quorum slice for each other so you can see that if you look at just nodes v2 v3 and v4 that is actually a quorum right because it contains a slice belonging to each one of its members right if you look at the set v1 v2 v3 that is a quorum slice but it's not actually a quorum right because in a sense do you want to saying well I'll agree to something as long as v2 and v3 agree to it but then v2 is saying well I'll agree to something only if B 3 and B 4 agree to it so you can see v1 is never going to be able to accept any statement unless v4 agrees to it because you know v2 is not going to agree to it unless v4 somehow involved here so in fact the the smallest quorum that contains v1 is a set of all nodes in the system v1 v2 v3 v4 okay so this is kind of a simple example let's turn to maybe a more realistic example which would be a kind of a tiered quorum slice so maybe there's kind of a top tier of very important nodes and they only know about each other but they're kind of configured in a three out of four setting where each of V 1 B 2 B 3 4 you know believes that three of those nodes constitutes a quorum slice and then you have maybe a middle tier that depends not on each other but on the top tier right so if he 5 here is saying well I'll consider a quorum slice you know myself plus two of the top tier note then maybe have a leafed here that also depends on on the middle tier so this is a lot like the internet right where we have Tier one ISPs but the nice thing about the Internet is nobody had to kind of magically designate these tier 1 ISPs that the tier 1 status kind of arose out of market forces and you don't even need universal agreement over who's at tier 1 ISP occasionally you have you know is cogent to tier 1 ISP or not and you know level 3 and cogent will play chicken or something and then you know eventually market forces prevail and they have to reach some kind of agreement so so in this case let's say that the market decides that you know these are the top 4 banks in the US you know these are really like that the top tier of financial institutions so nothing nothing happens unless you know 2 out of these or 3 out of these 4 banks agrees to it but then I see if you don't need to have unanimous agreement on that so let's say v7 and v8 are extremely paranoid and they say like you know what I realize I can't get fuel a for that interacting with these big banks but I don't trust them at all I think they're gonna try to rip me off the minute they have the opportunity to do so well no problem you can have some other people come on let's say we have a couple of nonprofits here stellar the e FF of the ACLU they can also participate in this consensus protocol and maybe v7 and v8 say like not only am I gonna wait for two of the big banks I also want one of these nonprofits to sign off on on a transaction before I consider it settled and the nonprofit's are gonna you know depend on two out of three of each other in addition to depending on on the main banks so now let's look at an example suppose Citibank decides to pay a billion dollars to v7 and v7 you know hands over whatever a billion dollars worth of goods right so why would v7 do this will V suddenly do this because here's two out of the four main banks doing this and also because it you know here's one of the nonprofit's like let's say stellar but stellar is not going to agree to something unless one of the other ones does so let's say the e FF had to sign off on this too so basically if you look at the nodes with green checkmarks that constitutes a quorum and those nodes can kind of that this transaction happened and permanently place it in the log okay so now let's say that v7s paranoia was actually justified so these big three banks collude they're evil and they try to reverse this billion-dollar transaction after they've walked away with a billion dollars worth of goods and now what they want to do is they want to go to v8 and give v8 a billion dollars and also walk away with a billion dollars worth of v8s goods but what's gonna happen here VA is gonna say okay I need two out of the big banks to breathe that's okay because these evil banks they'll agree to anything right but then he also needs one of the nonprofit's so VA is well stellar in the e FF Mart gonna agree to this because they agreed to a contradictory to a different transaction which the billion dollars went somewhere else so that leaves the ACLU so v8 goes to the ACLU but the ACLU says well you know Mike warm slice also has to include one of stellar or the FF and since those people since those organizations won't agree to this the ACLU won't agree to that and so v8 is never gonna accept this billion dollars and there's never gonna turn over a billion dollars worth of goods okay so that's the model and now the question is you know how secure can we make this right so the first big difference between federated Byzantine Agreement and the standard centralized Byzantine agreement is that in the federated model failure becomes a per node thing right in the centralized Casey the whole system fails or at least you know all the non faulty nodes agree now you can have some nodes failing and and others not right and so particularly kind of divide the nodes into the ill-behaved nodes in the well-behaved notice the L behave nodes are gonna deviate from the protocol do all kinds of nasty things they've obviously failed but you know they they you know who cares because they're not implementing the protocol the well behaved nodes unfortunately you know some of them might fail too if there are too many ill-behaved nodes and two things gonna move me bad is if well behaved nodes stopped being able to make progress because they can no longer you know assemble quorum or they could they can no longer agree to new transactions you know that's like all your ATM machines shut down right what's worse is if all the well behaved knows somehow get driven into divergent state so that's now like each of your ATM machines pays out the entire somebody's account so you've lost many times the value of the deposits he had we're storing for someone and obviously if a well behaved node is is not in either of these two cases then you know we'll say that it's correct okay so so let's say that we want to guarantee safety well what's a necessary property for guaranteeing safety suppose your system is configured such that you have two quorums that are entirely disjoint right well at that point obviously the guys on the Left can vote for some of the guys on the right can vote for something completely different in fact both sides of the diagram can make progress without even exchanging any messages across the two halves of the diagram so so clearly there's no way to guarantee safety here so like traditional consensus in order to guarantee safety we need what's called quorum intersection right where quorum intersection means that any two quorums are actually gonna share at least one node so in this case you say okay well let's fix this let's throw in some you node V seven and let's make everybody depend on V 7 and great now all our quorums intersected be seven but what happens if V seven fails right what happens if you know all your quorums intersect at a and the intersection to quorums consists entirely of malicious nodes well remember these these Byzantine malicious nodes they can give inconsistent messages to different people so V seven it can say one thing to the nodes on the left something else to the nodes on the right and again it can drive the system into divergent States so in fact the property that's necessary for safety in the presence of Byzantine failures is that we need quorum intersection despite the behavior of any ill-behaved nodes what that means is that we have to basically delete all the L behave nodes from the system and from everybody's quorum slices and we want to make sure that after we've deleted all those bad nodes that we still have quorum intersection but in this example if we delete these seven things they just reverts to the diagram on the previous page and obviously there's no way to guarantee safety okay so so what about liveness so suppose you have you know some some three out of four system and you know two out of the four nodes fail right so in this case the one it includes two to quorum slices that has V 1 V 2 V 3 actually 3 slices V 1 V 2 V 3 V 1 V 3 V 4 V 1 V 2 V 4 every single one of its quorum slices actually contains a failed node right so so if there's a set of nodes that intersects every single one of some nodes these slices will say that we'll call that a V blocking set so in this case we can say the set of fail nodes is V 1 blocking because it it denies V 1 the ability to have a quorum slice of non faulty nodes so you can see that failed nodes can't be V blocking for any correct node V because if they were then V could could never make progress like this a failed nodes could just crash and never help put a message and V would never be able to take ER unanimously from one of its forums because each of its quorums would include one of its quorum slices and that quorum slice would include a failed node so and it turns out saying that you know the failed nodes aren't V blocking for any correct node that's equivalent to saying that the correct nodes have to form a quorum so the necessary property for liveness in the system is that the correct nodes actually have to form a quorum okay so to kind of summarize let's say you have a set u of well-behaved nodes in a federated business agreement system and let's say that u bar are the nodes that are not in U and we don't know those those nodes might actually be ill-behaved we can't assume otherwise so in order to guarantee safety for you we need to know that u enjoys quorum intersection despite U bar right even if you delete u bar from everybody slices you'll still have quorum intersection and if you want to guarantee both safety and liveness then you need not only quorum intersection despite U bar but you also need u itself to be a quorum meaning you bars not be blocking for any of the nodes and in U and if both of these properties hold then we'll say that that the the first set u then we'll say the nodes in set you are actually intact and in fact you can show that if your quorum if your system has quorum intersection then there's actually one maximal set of intact nodes u so that's kind of convenient because it means we can just about the intact nodes in the system okay so now the question is can we actually achieve this kind of optimal failure resistance in a federated business agreement system so what I'll talk about now is the protocol that's at the core of the stellar system that actually that actually satisfies this optimal failure resilience okay so so yeah so the stellar consensus protocol it's obviously the first general FBA protocol and it guarantees safety for well-behaved nodes that enjoy quorum intersection despite the ill-behaved nodes so what that means is if you use my protocol and then you end up in a situation where your nodes diverge and you lost safety then you can at least take comfort in the fact that no other protocol could have done any better could have guaranteed you safety under the circumstances you just had too many failures or you had you know your your quorum slices were too small or were poorly chosen also SCP is going to guarantee that intact nodes never get stuck so you can always end up making progress and the core idea in the protocol is this technique federated voting so the idea is that nodes are gonna exchange boat messages but piggybacked on the votes are going to be a specification of every nodes quorum slices so that as you're assembling votes you learn people's quorum slices and out of that you can dynamically discover quorums and decide you know when a quorum has has actually voted for something so you know so as before you know that everybody even nodes via issuing these vote messages and if a node votes for some statement a of course that has to be consistent with all of node B's past votes and any statements that that V is accepted and if you reach a point where there's a whole quorum you that every node in which every node is voted for the same statement then we said if that quorum has ratified the statement right so ratifying a statement means like an entire quorum voting for it and you can show and it's not very surprising that if the well behaved nodes in the system enjoy quorum intersection despite the Oh behave nodes that they won't ratify victory statements right makes sense intuitively because because what if when you delete all the ill-behaved nodes that are issuing multiple votes you still have quorum intersection well you know the good nodes that the intersection of any two quorums are not going to change they're both so you're not going to have two quorums you know I'm voting for contradictory statements okay so that's all well and good we still have two problems though which is you could have an intact node V that's unable to ratify some statement a even after other nodes have ratified the statement a so some nodes might get stuck while other nodes think the system's agreed on something or you could have yeah and so that could happen either because because V voted against a value and it's not allowed to change this vote or because you know some node failed after voting and V didn't didn't here the votes so effectively the outcome of federated voting looks like this now this should be a familiar diagram this is exactly the same set of outputs that we could have in the centralized case right and so the first thing you might think is well maybe we could apply the same reasoning as in centralized voting so if you remember decentralized voting we said well let's say you have the intersection of two quorums and that intersection unanimously says hey we saw a quorum go for a well then maybe other people could decide that a quorum so for a even if they themselves didn't vote for a or if they didn't care the votes so does that work here well unfortunately not right so what's the problem the problem is before we have this premise that you know the whole system failed or you know it was okay and all the non faulty nodes we're gonna we're gonna have safety right and now failure is this is this per node property right it's not it's not a system-wide property and so particular you know if you look in node V n minus one over on the right hand side here he might see like oh well the intersection of my quorum and quorum a is like unanimously telling me something but I don't really care about quorum a for all I know quorum a on the left is like some weird civil attack that was constructed just to try to get me to accept some statement that that I shouldn't accept so tough you know I can't conclude anything from the fact that these people are telling me a quorum voted for a in fact in a federated system you only care about the quorums that you you belong to yourself and so what that means that the only way to kind of know that the system's reached an a valence state is to ratify the statement a first hand you can't believe it just because somebody else tells you that so what are we gonna do about this well here's an idea suppose that there's a V blocking set that unanimously tells node V hey you know you know we saw this we saw it we know we've accepted the statement a right so either either that whole set is lying in which case node V is no longer intact so we can't guarantee it's correct or in fact you know the system did you know reach some a valid state writing these guys correctly accepted that the statement a so based on I'm not a Yale is to find this notion of accepting something we'll say that a node the accepts a statement a that's you know consistent with with v's history as long as one of two things holds either there's a quorum that includes V in which each member either you know voted for a or accepted a so that kind of includes the case where V is first hand ratifying the statement a or there's some V blocking set that unanimously claims that that they have accepted the statement a so the statement number two here is what allows a node to accept a statement even after it's voted against that statement right and in fact you can show is shown in the white paper that corresponds of this talk that intact nodes will never accept contradictory statements right so that sounds great right except we still have two problems right first of all there's this the standard liveness and stuck problem that you know there's still no guarantee that just cuz one intact node except it's something all the intact nodes will accept it and second of all there's this problem that we've actually weakened safety by by adding this second property here and you could end up in a situation where you have non intact nodes but that still ought to enjoy safety even if they don't enjoy liveness and now we have no way of guaranteeing that they won't accept contradictory statements one way to look at this by now with the central system is that what we really wanted was a set of size FS plus one to tell us something and what we have a V blocking said it's more like FL plus 1 instead of FS plus 1 ok ok so how do we fix this well the solution is actually to hold a second vote on the fact that the first vote succeeded so we'll say that a quorum confirms some statement a when that quorum ratifies the statement we accepted a write and so a node that's that's part of that quorum has has confirmed the statement so this immediately solves problem to the sub optimal safety problem because now in confirmation it is the first hand ratification right there's no weird second criterion here that involves some V walking set right you need a quorum to confirm something turns out this also solves the first problem where you have intact nodes that are unable to accept something right because if you think about it you can have intact nodes that vote against a statement that's later accepted but they won't vote against the fact that the statement was accepted the fact that the state was accepted is actually irrefutable right and if you remember that was one of the two kinds of statements that it's okay to vote on it's irrefutable statements right and in fact I show that once you have a single intact node in your system that's confirmed a statement you're guaranteed that all intact nodes will eventually confirm that statement you know once enough messages have been delivered okay so here's summarizing this this federated voting process right you start in some uncommitted state you have some statement a V think is valid you might vote for a and then if you see a quorum vote for a you'll accept a you'll hold another vote and then you'll confirm a or it could be that you voted against a that's ok you might see a V blocking set say that they've all accepted a and so at that point you'll accept they anyway and then you'll again vote to confirm it and get to the right-hand side right and once you've reached the confirmed stage you know that all the other intact nodes if you're intact will also confirm a and so you can kind of rely on a for both purposes of safety m liveness okay so how do you get from voting to consensus what turns out to be exactly the same thing as taxes right so you're gonna vote to commit an abort ballots we're gonna have the same invariances before any only all the committed and stuck ballots are gonna have to have the same X and you're gonna have to prepare a ballot before you vote to commit it there's just for convenience slightly different definition of preparer where we just totally order all the ballots rather than rather than the way it's done in paxos but it effectively works out to be the same thing now there's one interesting twist which is how do you know what values to vote for in the first place right so far this is all meant for you abstract and what you'd like is actually for the system to kind of converge on some value and then everybody votes for the same value in the reasonable likelihood and then you can just kind of finish your protocol quickly versus everybody trying to nominate different incompatible ballots and you never make progress so the other thing the other component of this protocol that all have time to get into is a decentralized nomination protocol the basically lets you select the set of values that you can then you can then combine deterministically so everybody ends up with the same value and the key insight in the nomination protocol is the fact that you can vote to nominate a value but you can't vote against nominating a value so nominating values is essentially irrefutable and that's what allows the system to kind of converge on some set of nominated values okay so this protocol SCP has a bunch of interesting properties right first of all is decentralized and then it can actually realistically accommodate this kind of like internet level consensus it's also low latency by which I don't mean microseconds but I mean compared to the amount of time you wait for credit card swipe like yeah we can do this in five seconds no problem it's got flexible trust you can have you know a small nonprofit like stellar or the e FF helping to keep the big banks honest even though they have many fewer resources and it just depends on digital signatures and hash functions so you've got kind of standardize some Taric security you tell me how powerful your adversary is and I'll tell you whether to choose you know a 256 or 512 bit hash function how big your signature keys should be etc and basically you can have zero chance of you know effectively zero chance of and sorry breaking it now obviously there's a lot of related work standard Byzantine agreement it provides the latter through properties but it's not decentralized that's why it didn't solve this problem what didn't solve this problem is proof of work and you know we've seen how powerful that is because of how much Bitcoin has taken off right but Bitcoin doesn't have the other three properties but in a big coin you can't decide you want to trust the FF you have to trust whoever has the most hashing power and then there's also proof of stake which is kind of like a you know a cross between proof of work and Byzantine agreement but it's still there you're trusting the people who have the most money you which is not necessarily you know they ACLU or whatever so it's important to point out that consensus is not the same thing as a cryptocurrency so SP doesn't offer some you know rate limited way to mint new coins it doesn't provide intrinsic incentives for good behavior right we expect people are participating in our system because their financial institutions that see value in having this you know this like global financial network but they don't get paid to do it they get paid because they're making money as banks it also doesn't tell you whom to trust so you know you could obviously configure your quorum slices in a bad way and end up with with an insecure system well that said I think the SCP has a bunch of applications beyond financial networks and one of them that I want to point out is ca accountability right so one problem is that all our browsers you know they ship with dozens and dozens of routes CAS any one of which can sign certificates you know true story Turk Trust went and you know signed in bogus certificate for google.com allowing Manning and Men in the middle of tax so people are trying to address this through efforts like certificate transparency but certificate transparency requires you to kind of trust a set of logging agents we're gonna output these logs of certificates right as you could actually increase confidence in the system by having this be more like a global log that the entire you know internet signs off on and that would make it virtually impossible to rollback a certificate that had been issued so se bika basically increase confidence in the auditing now stellar you know it's this financial network that I've been mentioning and in fact it is a system that we've deployed our initial focus is in Nigeria because I you know the banking system there is a big pain point there are many fewer people with bank accounts there and and and financial services are much less smooth a few months ago we integrated with a red in who's a vendor of of software to microfinance institutions and so that provided access to about two or three hundred microfinance institutions who can now send money over / stellar will be able to as soon as we get regulatory approval for that and finally since I know this is also going to be on on YouTube where some non-googlers may yes I want to mention that we're hiring so if you like the idea of working on cutting-edge distributed systems please contact us at stellar we've got some cool everything we're doing is open source and and it's it's pretty interesting stuff today in taking time out of your busy schedule I think we might have just a few minutes does anybody in the audience have any questions so interesting correctly at the slightest are immutable and if not how do they get reconfigured ah so that's a good question so in in traditional Byzantine agreements systems it turns out that reconfiguration is a big pain and sometimes I feel like feed it back through the system itself it turns out that that's the one area that we get that for free so you can kind of unilaterally change your quorum slice whenever you want now it's better if you do it sort of from one log entry to the next in other words if you're in the middle of the side and something you could you could change it before you move to the next question that you're deciding on but you can do it you can do it midstream it's just in terms of safety you it's a little bit weaker because you kind of have to treat it like the union of all the configurations that you had the question over there you said the messages themselves contain all your the definitional slices yeah so they contain a cryptographic hash of it and then if that hat and you have a big cache and then if you don't have that if that you know the hash is not in your cache then you just there's a separate side protocol where you say please give me the preimage of this hash value I see and and do you guys sort of who runs the service that provides each each node so like so if I hear a message from you and it includes a hash and I don't have that hash I just ask you for it so I mean it's effectively it's just an optimization conceptually it includes the whole thing it's just it would be a more bandwidth than we want to consume so sort of like little public keys and look fastest through paths maybe Thanks this is open-ended but if you've given this talk to a group of graph theorists that's curious if you've run across some good analogies or other ways of describing this as a graph theoretic problem is about creating clicks or colorings or paths between subsets certainly fully longer explanation that's been very helpful no I haven't and it would be it would be interesting I mean one thing is we have like the stellar core demon there's like a command-line switch that lets you kind of like look at that kind of the health of your quorum slices and it can give you an expense out like a few examples of like failures that might undermine the system just to kind of give you a sense but but that that mode is actually I think it's like extra it's exponential time so I think maybe there's actually some better way of doing this so it could be interesting future work there includes the talk for today as mentioned earlier it's gonna be on the YouTube channel so feel free to check it out and share with your friends and co-workers all right great thank you

Transcript for:Stellar Consensus Protocol Overview

Transcript for:
Stellar Consensus Protocol Overview