[Music] hello everyone welcome to this lecture the outline for this lecture is as follows in this lecture we will discuss what is secure multi-party computation secure mpc is we will see some real world applications we will see various dimensions in which one can study the secure mpc problem and since this course is all about studying mpc tolerating malicious adversaries we will also discuss the challenges in dealing with malicious adversaries ok so let us start with the motivation behind secure multi-party computation so there are several applications which are distributed applications and in such applications we require privacy preserving information processing namely in such applications we have several entities in fact we call them as mutually distrusting entities say we have n number of entities p 1 p 2 p i p n and they are mutually distrusting in the sense they do not trust each other and each entity has some confidential data associated with the entity okay so for instance p 1 has the data x 1 p 2 has the data x 2 p i has the data x i and p n has the data x n so the data could be in any kind of abstract data it could be for instance they are personal details or it could be their salary information or it could be their biometric details or it could be some huge database so we do not want to go into the exact details of the data but we abstract it out at each entity here has some private or confidential data even though the entities are mutually distrusting still they would like to perform some joint computation on their data without revealing their inputs so the computation which they are interested to perform can be abstracted by some function f which is publicly known and this function is an n array function that means it takes n inputs where the ith input is going to be provided by the ayat party and the parties want to deserve we run a car run some protocol execute some protocol which allows them to learn the outcome of this function f on the input x 1 to x n but in the process no party should learn anything additional beyond what it can infer from its own input and the function output okay so this problem is also called as secure multiparty computation or mpc problem why mpc why secure multi-party computation because we have multiple parties involved here and together they would like to perform some computation in a secure way we often call this problem as computing on private data because we have several entities involved here each of these entities has some private data and we would like to perform some computation involving the sensitive data of all the entities so if you are wondering that we can solve this problem by asking or by letting each entity to encrypt its data x i so p 1 can encrypt its data x 1 and make it available to everyone p 2 similarly can encrypt its data and make it available to everyone p i encrypts its data x i and make it to available to everyone and p n encrypts its data and make it available to everyone well ah that ensures that the data remains ah confidential in the sense that as soon as i encrypt the data it becomes a jargon for everyone outside it it becomes a jargon for everyone else no one can figure out what exactly the contents which are encrypted in that encrypted data but then you cannot perform the computation or you cannot compute the value of the function f on the encrypted data right so in some sense encryption solves the problem of confidentiality it ensures that the data remains confidential but it does not make the data available so that computation can be carried out so it looks like that we are now trying to solve a problem secure mpc problem which has two con conflicting goals okay it has two conflicting goals namely we want our data to be available because only when the data is available one can perform any kind of computation on the data but at the same time we want that the data should remain confidential and this two looks like some conflicting goals and it might look that this problem is not possible to be solved but in this course we will see ah lots of protocols algorithms how to solve this problem an analogy that we can consider here is that of jewels and locker so if you have if you can if you imagine that your sensitive data is some kind of jewels precious jewels which you do not want to be stolen then you can preserve them by locking in a locker but at the same time you would like to wear them for some occasions right so now let us see various real world applications examples which are apps which can be abstracted by this generic problem of secure multi-party competition so the first problem is privacy preserving data mining so consider a scenario where we have several hospitals right so in this example i am taking three hospitals and each hospital has its own private medical record okay namely the patient database and medical record medical database is a very sensitive information it cannot be disclosed in public domain there are various legal implications if any hospital tries to do that but there might be a scenario where say all the hospitals would like to run a protocol to perform some kind of data mining operation across the databases of all the hospitals so for instance they might want to find out or they might want to compute statistics like how many patients are suffering from aids in total or is there any common patient registered for some specific disease in all the hospitals and like that varieties of variety of other interesting statistics of course they can compute the statistics if each hospital makes it makes its public makes its medical record available in the public domain but as i said there are various legal implications regarding that so now the question is is it possible to perform any kind of data mining operations in a privacy preserving way ok so now let us see how this problem can be obstructed by the problem of secure multi party computation so here the parties are nothing but my hospitals okay and the private data associated with every party is nothing but the medical records or medical database ok and the function f which the parties would like to compute ah corresponds to the data mining operation or the query which the parties would like to run on the join database right consider the problem of secure e-auction where we have say n bidders and they are bidding for some valuable object and we would like that only the highest bid should be revealed during the bidding process and no bidder should learn anything additional about the individual bits of course the losing bidders will learn whether the value of their bid is less than the winner or not that's allowed to be revealed right remember the problem description is that if i am the ith party then based on my own input and the function output whatever i am allowed to learn that is fine but nothing additional should be learned so if i am the ith bidder and if i am not selected as the winner in the secure e-auction protocol then i know that the value of my bid is definitely less than the winning bit okay that is allowed to be revealed that is not a breach of information okay so again you can see that how the secure mpc problem abstracts this application okay so the parties are nothing here but the bidders parties correspond to the bidders and x i's correspond to the private bits and the function f corresponds to the max function ok consider another interesting problem the yau's millennials problem and in fact this is the first use case of mpc the problem of mpc was introduced by during award winner andrew yao in his seminal paper and in this in his paper he motivated the problem with this example he called this problem as yao's millionaires problem so imagine we have two millionaires who are not in talking terms with each other but still they are interested to find out who is richer uh among them without disclosing the exact value of the asset okay so this is a special k this is this problem is nothing but a special case of secure two pc two party computation because here we have only two parties involved okay of course you can imagine that there are n number of millionaires where n is more than two and they are interested to find out who is ritual between them that's something similar to our e-auction problem now consider this very interesting application of secure multiparty computation namely how we can ensure that satellite collision does not happen but that do in a privacy preserving fashion so imagine we have two countries who are not in talking terms with each other and they launch their spy satellites ok so since they are launching the spy satellite the trajectory information of their individual satellites will be a secret information okay because it is a spy satellite if it is not a secret information then the other country can of course go and destroy the satellite of the other country so that is why they would like to launch their respective spy satellites in a secret way where the trajectory information is kept hidden but since both the countries are launching their individual satellites without knowing each other's trajectory information it is quite possible that collision occurs between the two satellites and if you search in the google you can find out that several such collisions have been reported in the past so it's not some fancy example and if any such collision occurs it cause it its actually a loss of what billions and more than that the debris which are left over in the space is a serious issue so now the question is is it possible for these two countries to find out what's the probability or what's the chances of collision without disclosing their individual trajectory information of course if both the countries disclose their trajectory information to each other then they can find out whether there is a possibility of collision or not but trajectory information is the secur is the secret information the secret data available with the respective parties which they do not want to disclose so again you can see that how this problem very nicely fits into the paradigm of secure multiparty computation we have n equal to two parties and their x i information is nothing but their individual trajectory information and the function that they are interested to compute is whether what's the probability of collision okay so like this there are several real world applications several real world problems which can be very nicely abstracted by the problem of secure multiparty computation and this problem is like the holy grail problem of secure distributed computing it's one of the most fundamental problems in secure distributed computing and it has been studied enormously over the last four decades in various dimensions so let us list down the various dimensions in which one can study the secure mpc problem ah dimension number one ah is defined based on how we abstract the function to be securely computed so there are two popular ah ways to abstract out the function ah which the parties would like to securely compute either we assume that the function is represented as a boolean circuit consisting of your logical gates say the nand gates nor gates and so on or we assume that the function to be securely computed is abstracted or represented by some arithmetic circuit over some appropriate algebraic structure it could be a group it could be a ring it could be a field and so on dimension 2 is based on what kind of communication network is available among the parties so whenever we want to design a secure mpc protocol according to the protocol the parties have to exchange messages and those messages will be exchanged over the underlying network now there are two ways two types of communication models which are considered in the literature the first model is the synchronous model synchronous communication model where we assume that the communication channels among the parties are synchronized through some global clock that's why and and that implies that if i am one of the sender parties and if i am supposed to send some message according to the protocol then the receiving party will know within what time that message is supposed to be received by that receiving party right so for instance that channel delay could be some minutes could it could be some order of minute it could be some order of hours and it could be order of days but whatever irrespective of the case the channel delay will be publicly known and as a result of that if any expected message does not ah is not is not received by a recipient in the protocol execution then the recipient can always label the sender party as a corrupt sender party okay a more practical communication model is the asynchronous communication model we are do not where we do not make any such assumption regarding the channel delays that means if i send any message as a party during the protocol execution then the only guarantee will be that it will be delivered eventually but no one will be knowing when it's going to be delivered okay moreover the order in which i have sent the messages uh that order need not be preserved when the messages are delivered to the corresponding recipient it might be reordered or it might happen that whatever i send tomorrow that get delivered and whatever i have sent today that get delayed by some arbitrary amount but it will be eventually delivered right and this asynchronous communication model ah very nicely captures with nicely models your real world networks like the internet where on there where no one can give you an upper bound strict upper bound on the maximum delay within which any message will be delivered to the corresponding receiving party right however ah designing asynchronous protocols is very challenging compared to designing synchronous protocols the third dimension is based on the corruption capacity so remember we are assuming that the parties are mutual it is trusting and to model the distrust in the system we assume that we have a centralized adversary right who can capture or who can control a subset of the parties now there are two ways by which we can model the way it can or by there are two ways by which we can model the subset or the cardinality of the subset of the parties which can be controlled by the adversary in the threshold model we assume that there is some publicly known threshold say a t and adversary can corrupt at most t out of n parties they could be any subset of tea parties the exact identity of those tea parties will not be known before the starting of the protocol right but everyone will be knowing that during the protocol execution at most t entities can get corrupt by some adversary whereas a non-threshold adversary is a more generalized form of adversary where we model the corruption capacity through an adversary structure which is a collection of subsets of potentially corrupted subsets of parties right that means we will have several subsets given each of this is a subset of your n parties of various cardinalities and during the protocol execution any of these subsets could be chosen by the adversary for corruption right so that is called non threshold adversary dimension 4 is the corruption power that means how much resources is available with adversaries so here we have two kinds of adversaries which are considered computationally bounded adversaries and computationally unbounded adversaries computationally bounded adversary means their running time is bounded by some polynomial function in some appropriate security parameter and that allows you to use cryptographic tools modern cryptographic tools like signature schemes in public encryption scheme symmetric encryption schemes and so on whereas the other other extreme is the unbounded adversary model adversarial model where we do not put any restriction whatsoever on the computing resources or the running time of the adversary and as a result in such if you are designing a protocol tolerating computationally unbounded adversaries then you cannot use modern cryptographic tools because the security of mall modern cryptographic tools is based on ah so called hardness assumptions so in some sense this latter class of protocols allows you gives it provides you everlasting security because even if some computational assumptions are broken they no longer hold in future then still it will be ensured that a protocol which is secure against an unbounded adversary remains secure the fifth dimension is the type of corruption so here we have three categories passive corruption crash corruption and active corruption passive corruption means adversary only eavesdrop the state of the corrupt parties but it cannot force the corrupt parties to behave arbitrarily during the protocol execution that means whatever instructions are assigned as per the protocol to a corrupt party the corrupt party will follow that instructions crash failure is a slightly powerful form of corruption and active corruption is the most powerful form of corruption so we will discuss about these three types of corruptions very soon and the sixth dimension is based on the corruption time whether the adversary decides this set of corrupt parties at the beginning of the protocol itself such kind of adversaries are called static corruptions or we can have a more powerful adversary adaptive adversary which adaptively corrupts the parties as the protocol proceeds depending upon how much information adversary has already seen during the protocol execution so in my existing nptel course title secure computation part 1 which is available at this link you can find out a more detailed discussion regarding the various dimensions in which we can study the secure mpc problem and the focus of the part one of the course was completely on passive corruptions now this course is all about active and malicious corruptions which is the more powerful form of corruption so let us go and see in more detail the difference between the types of the corruptions so as i said there are three types of corruptions the most the simplest form of corruption is the passive or the semi honest corruption where adversary is a eavesdrop adversary is an eavesdropper and it can only eavesdrop the state of the corrupt parties but it cannot influence or it cannot cause the corrupt parties to deviate from the protocol instructions a slightly powerful form of corruption is the crash corruption or fail stop corruption where the adversary has the capability to eavesdrop plus it can cause the corrupt parties and make and stop them from functioning at any time during the protocol execution right and once an adversary corrupts the parties and stop it from functioning then from that point onward for the rest of the protocol execution the party will be completely inactive ok it cannot suddenly become active after some time so that is the crash adversary but i stress that as long as that party is alive during the protocol execution it will behave like a passive it will behave like it as if it is under the control of a passive adversary that means it cannot deviate from the protocol instructions the most powerful form of the corruption is the active corruption which is often called as malicious corruption or byzantine corruption and here the adversary is very powerful it can completely control the state of the corrupt parties okay that means it can not only just eavesdrop the state of the corrupt parties it can cause it them to crash as well at any time during the protocol execution and it can also cause the corrupt parties to start sending arbitrary values completely deviating from the protocol instructions so for for instance if during the protocol execution a party is supposed to send some value x to everyone then a byzantine corrupt party may instead send x prime to one set of parties x double prime to another set of parties and it may not send anything to another set of parties right so that models complete completely arbitrary behavior on part of the adversary and that is why it is extremely challenging to deal with such corruptions so in part one of the course title secure computation part one we only considered passive or semi honest corruptions but in this course we will consider mpc protocols tolerating the most powerful type of corruptions namely the byzantine corruptions so to give you a feel that what are the challenges which are involved while designing protocols against byzantine corruptions let us take a very very simple task namely that of broadcast okay and let us see that how this simple task becomes so challenging as soon as i assume that during the protocol execution i might have byzantine corruptions so what is a broadcast problem here you have a designated sender party so everyone will know who is the center party and the center party has some message it could be a bit or it could be a bit string we do not care and the goal is to ensure that the center party sends its message identically to all the parties okay even if up to t parties in the system are corrupted ok and when i say t parties in the system are corrupted i mean to say that the sender could be one of those t parties well it need not be always the case right sender is one of the n parties it has some message and everyone is expecting sender to broadcast it message send the same copy of its message to everyone and everyone also knows that in the system there can be at most t corruption where t is some parameter and it could be the case that among those three parties which can be corrupted sender is also one of the entities okay so if we consider this problem in the semi honest corruption setting where we assume that the t corrupt parties could be only passively corrupted then this problem is very easy to solve so imagine this is the sender party to broadcast its message what it has to do is it has to just send a message m to all the parties that's all so i assume that there is a method there is a communication channel between this party and every other party so it can just send a copy of m to everyone else and since i am assuming that i am in the semi honest corruption model even if this party is under the control of this adversary even if the part even if the sender party is under the control of the adversary that need not be always the case i stress but even if it is under the control of an adversary since we are assuming a semi honest corruption model it will send the same message to everyone because the protocol instruction is send your message to everyone that is a one line code that's a broadcast protocol in the semi honest setting so the semi honest sender or potentially semi on a sender cannot deviate from that protocol instruction it will send the same copy of its message to everyone else but now if i consider the same protocol the same one line code send your message to everyone send m to everyone suppose this is the protocol code and now if we execute this protocol in the byzantine corruption model where the sender could be potentially corrupt then it can do the following to different parties it may send different versions of its message because it need not follow the protocol instructions because we are assuming now a byzantine corruption model as a result of this the simple one line code of sending the senders message to everyone will simply fail because now different parties will have different version versions of the senders message so now we need to do something more on top of sender simply sending its message to everyone namely we would require the parties to interact right and agree upon a version of the message sent by the center and the amount of interaction which is required to come to a conclusion to come to an agreement may vary from protocol to protocol so what i have demonstrated here is that a very simple task like sending a message identically to everyone becomes so challenging so complicated as soon as we consider byzantine corruptions because byzantine adversaries can completely force the corrupt parties to behave in a completely arbitrary fashion ok so with that i conclude this lecture so let me summarize what we have discussed in today's lecture in today's lecture we have introduced the problem of secure multi party computation we have seen various applications various real world problems which can be abstracted by this problem of secure multiparty computation and we have seen various dimensions in which we can study this problem this problem is often called as the holy grail problem of secure distributed computing we have seen dimensions different like the various ways to abstract the underlying function the way we can model the corruption capacity the nature of the adversary the timing of the adversary the computing power of the adversary and so on right so the reference for today's lecture ah is basically the week one of the part one of the course and in that week one you can refer to the lectures one two and three to know more about the various dimensions in which we can study the mpc problem thank you