Transcript for:
Understanding Bayes' Theorem and Its Applications

Hello everyone. In this lecture we're going to talk about just one thing, and that's Bayes' theorem. It's not anything really new, probabilistically speaking; we're not learning any new rules. We're really just recombining what we already know. So it might seem interesting and new, but it's just putting together what we already know in a slightly different way. It's called Bayes' theorem, and it's named after Mr. Bayes.
Let's look at an example. Let's say there are two events: you have some disease, and you test positive for the disease. So A is that you have the disease, and B is that you test positive for the disease. And let's say we know a few facts. We know that 5% of people have the disease, so the probability of A is 0.05. Now, let's look at testing positive. We know that if you have the disease, there's a pretty high chance you will test positive. You want this number, the probability of B given A, to be high because you want to find everyone who has the disease, right? So let's say that's 0.99. But the higher this number is, the more false positives we're going to have too. If you push the test to be really sensitive, you're generally going to get more false positives as well. So let's say that 10% of the people who don't have the disease also test positive; that is, the probability of B given not A is 0.10. So again, the probability of testing positive given that you have the disease is 99%, so we're catching everyone but 1%, and yet there are false positives: 10% of the people who don't have the disease still test positive.
So now I have a question, and the question is: what's the probability of actually having the disease given that you test positive? You're tested, you come out positive; what's the probability you have the disease? Well, that information seemingly isn't here, but we can get it. We're going to use Bayes' theorem, and it's just, again, recombining what we already know. We want to know the probability of A given B, right?
Well, the probability of A given B equals the probability of A and B divided by the probability of B. Nothing new there, right? That's the definition of conditional probability. And we know that the probability of A and B equals the probability of A times the probability of B given A. Okay? That's simply the multiplication rule. So we're looking for a conditional, the numerator is the multiplication rule, and all we're doing is writing out the formulas we already know: the probability of A given B equals the probability of A times the probability of B given A, divided by the probability of B.
Now, the problem here is that we don't necessarily know the probability of B. If you look up Bayes' theorem online, you might find something like this as the statement of it. But oftentimes the probability of B isn't known, and that adds an additional wrinkle. It's something we can solve, though, because we know that if you test positive, you either have or don't have the disease. That's tautologically true, right? So the probability of B equals the probability of testing positive and having the disease plus the probability of testing positive and not having the disease.
So now we can simply write out each piece. What's the formula for A and B? It's the same as in the numerator: the probability of A and B equals the probability of A times the probability of B given A. The probability of not A and B equals the probability of not A times the probability of B given not A. Okay?
So, we have that information. The probability of A is 5%. The probability of B given A, of testing positive given that you have the disease, is 99%. So in the numerator we have 0.05 times 0.99. And in the denominator we again have the same thing, 0.05 times 0.99, plus the second term. And for that, what's the probability of not A? Well, you either have the disease or you don't.
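Written out in symbols, the steps described above can be summarized roughly as follows. The notation is an editorial assumption, not something shown in the transcript: $\neg A$ stands for "not A" and $A \cap B$ for "A and B".

```latex
\begin{align*}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)}
  && \text{(conditional probability)} \\
P(A \cap B) &= P(A)\,P(B \mid A)
  && \text{(multiplication rule)} \\
P(B) &= P(A \cap B) + P(\neg A \cap B) \\
     &= P(A)\,P(B \mid A) + P(\neg A)\,P(B \mid \neg A) \\
P(A \mid B) &= \frac{P(A)\,P(B \mid A)}
                    {P(A)\,P(B \mid A) + P(\neg A)\,P(B \mid \neg A)}
\end{align*}
```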
So if the probability of having the disease is 0.05, the probability of not having the disease is 0.95. The probability of having the disease is 5%; the probability of not having the disease is 95%. And now what's the probability of testing positive given that you don't have the disease? 10%. Now, this number has to be given to you. If you don't know it, you can't calculate it. It's not related to the 99% per se; it's not 1 minus that number. We could calculate the probability of not A from the probability of A, but you can't calculate the probability of B given not A from the probability of B given A. So again, the information is given to us and we plug it in.
Now we just multiply. In the numerator, 0.05 times 0.99 gives 0.0495. That's the proportion of people, almost 5%, who have the disease and test positive. Because 5% of the people have the disease and almost all of them test positive, it's just a little below 5%. The first term in the denominator is the exact same number: the proportion of people who test positive and have the disease. But what about the people who test positive and don't have the disease? Only 10% of the people who don't have the disease test positive, but they make up 95% of the people. So 0.95 times 0.10 is 0.095: almost 10% of all people will test positive even though they don't have the disease.
We add those together and we get the proportion of people who test positive overall: 0.0495 plus 0.095 is 0.1445, or 14.45%. So we can see that 14 to 15 percent of the people test positive, but only about 5 percent of the people have the disease and test positive. So only about one-third of the people who test positive actually have the disease. Dividing 0.0495 by 0.1445, the probability of having the disease, given that you test positive, is 0.34256.
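The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `posterior` and the variable names are illustrative choices, not anything from the lecture, and the inputs are the three numbers given in the example.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(A | B) using Bayes' theorem.

    prior               -- P(A), probability of having the disease
    sensitivity         -- P(B | A), probability of a positive test if diseased
    false_positive_rate -- P(B | not A), probability of a positive test if healthy
    """
    true_positive = prior * sensitivity                 # P(A and B)
    false_positive = (1 - prior) * false_positive_rate  # P(not A and B)
    p_positive = true_positive + false_positive         # P(B)
    return true_positive / p_positive


# The three facts given in the lecture's example.
p_a = 0.05              # P(A)
p_b_given_a = 0.99      # P(B | A)
p_b_given_not_a = 0.10  # P(B | not A)

# Intermediate quantities, computed step by step as in the lecture.
p_a_and_b = p_a * p_b_given_a                 # have the disease and test positive
p_not_a_and_b = (1 - p_a) * p_b_given_not_a   # no disease but test positive
p_b = p_a_and_b + p_not_a_and_b               # test positive overall
p_a_given_b = posterior(p_a, p_b_given_a, p_b_given_not_a)

print(f"P(A and B)     = {p_a_and_b:.4f}")      # 0.0495
print(f"P(not A and B) = {p_not_a_and_b:.4f}")  # 0.0950
print(f"P(B)           = {p_b:.4f}")            # 0.1445
print(f"P(A | B)       = {p_a_given_b:.4f}")    # 0.3426
```

Keeping the three inputs as separate arguments mirrors the point made above: the false positive rate has to be supplied on its own, because it cannot be derived from the other two numbers.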
And again, that's because most people don't have the disease, and most of the positive tests are actually false positives, given this data. Obviously, if the data doesn't look like this, we might get a completely different number. It's possible this number could, in fact, be very large. That's just the answer we got with the numbers I gave. But anyway, this is Bayes' rule. It helps us find the answers to questions that we seemingly don't have the information to answer, but in fact do. Okay? So again, we're just rearranging things. No new rules of probability here, just rearranging things in a different way. This is Bayes' rule. It's used quite a lot. And that's it for today. Bye-bye.