All right, guys, here we go! We are going to start Section 7.3, and this is super exciting because we are going to study the Central Limit Theorem. Now, I understand you guys are like, "Okay, okay, Shannon, but why is the Central Limit Theorem so exciting?" Well, as its name emphasizes, it is central. It is central to the study of statistical inference; it is central to statistics overall, because without it, we literally could not make the jump from sample to population. All right, so here we go again. What's the main point here? The Central Limit Theorem is what drives our study of statistical inference. Why? Because the Central Limit Theorem is what allows us to take a good sample and then stretch that to say, "This represents my population." It stretches the sample so that I can understand my population. Why is this so exciting? Well, because at the beginning of Chapter 7, we talked about how you can have a bad sample. We talked about all of the terrible biases that can exist that would make a bad sample. And on top of that, we just talked about how, if your sample is way too small, it's not going to represent your overall population. See what we did? We have talked for the last seven weeks about what makes a good sample, and what we're going to do over the next few chapters is take this idea of a good sample and use it to understand the population. Now, before we dive in, I do need to make a point that the Central Limit Theorem will have two versions, all right? We're going to have a version of the Central Limit Theorem that we're going to learn today; it's going to be the Central Limit Theorem specifically for sample proportions. Why? Well, because ultimately in Chapters 7 and 8, we're going to be studying categorical data, and the way you study categorical data is going to be different from the way you study numerical data.
And so, inherently, you're going to need a Central Limit Theorem for categorical data, for proportions. And then a little bit later on, we're going to take everything we learned about the Central Limit Theorem and just do a little bit of a side cha-cha, a little bit of a side shift, and apply it again, but for sample means, apply it again for looking at numerical data in Chapters 9 and 11. All right, so I want you to just keep note: the Central Limit Theorem is going to happen twice, all right? And in particular, what we are going to do for the next few weeks in Chapters 7 and 8 is focus on categorical data, therefore looking at proportions. All right, here we go. So, when it comes to studying the Central Limit Theorem, the idea is that if some basic conditions are met, one and only one good sample can be used to understand my population. I need to make a point to say one and only one sample, meaning if you do the Central Limit Theorem right, you don't need to do trial after trial after trial. The Central Limit Theorem says, "Gone are the days of multiple trials: if certain conditions are met and you make a good sample, that one sample is all you need." So, what about this sample do we need? Ultimately, the Central Limit Theorem is going to have two parts to it, all right? It's going to have Part One and Part Two. So, let's talk about each part. Part One is going to be about the conditions for the Central Limit Theorem, and the conditions are going to be: a random sample, a large sample, and a large population. You're like, "Wait, didn't we just talk about this?" My answer is yes. Yes, these conditions for the Central Limit Theorem are 100% based on what we talked about. Let's go back and remember: if your sample was collected randomly, it means that we would have an accurate sample proportion.
Let's go back and remember: if you have a large sample, it means you will have a precise sample proportion. And so, what do these two things lead to? They lead to the fact that we will be making a good sample. Ultimately, the first two conditions of the Central Limit Theorem are all about enforcing the idea of, did we make a good sample? The idea of the Central Limit Theorem conditions is ultimately to answer the question, did I make a good sample? And that is done by identifying: was your sample collected randomly, and was your sample large enough? Now, keep in mind, when we say large sample, it's no longer going to be some kind of handwavy, wishy-washy explanation of what large is. Remember back in Chapter 1, we kind of argued about, well, is that sample large enough? What makes a sample large enough? Well, now we have an exact definition. Here we go: it needs to be two things, and that's why we have the word "and". You need at least 10 successes as well as at least 10 failures; yes, to have a large sample, you need both things satisfied. So, say we're asking the question, do you love puppies? You need at least 10 people saying, "Yes, I love puppies," but in addition to that, for your sample to be large enough, you also need at least 10 people saying, "No, I don't love puppies." The final condition that we need is that the population is also large enough, and again, a large enough population is simply one where, if you take 10 times the sample size, your population is still bigger than that. Part One of the Central Limit Theorem is all about checking if the sample is good enough. It's all about checking if the sample satisfies all, and I need to emphasize, all the conditions of the Central Limit Theorem for proportions. Part One is all about identifying if the sample satisfies all the conditions, and what I want you to note is that Part One is all about the sample. It's asking the questions: was the sample collected randomly, and is the sample large enough?
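To make the numeric checks concrete, here's a minimal sketch in Python of the Part One conditions. The function name and the example numbers are mine, not from the lecture, and note that randomness itself can't be verified from counts at all; you have to know how the data was collected.

```python
def check_clt_conditions(n, p_hat, population_size):
    """Check the countable Part One conditions of the CLT for proportions.

    Assumes the sample was already collected randomly; that condition
    comes from how the data was gathered, not from these numbers.
    """
    successes = n * p_hat          # number of "yes, I love puppies" answers
    failures = n * (1 - p_hat)     # number of "no, I don't" answers

    # Large sample: BOTH at least 10 successes AND at least 10 failures.
    large_sample = successes >= 10 and failures >= 10

    # Large population: population is bigger than 10 times the sample size.
    large_population = population_size >= 10 * n

    return large_sample and large_population
```

For example, a sample of 100 people with p-hat = 0.5 drawn from a population of 10,000 passes, but the same sample with p-hat = 0.05 fails (only 5 successes), and so does a population of just 500 (less than 10 × 100).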
Even when you're looking at the large population condition, it's still asking you about the size of the sample. Part One of the Central Limit Theorem is making sure certain conditions are met so that we have a good sample. Why? Well, because only if all the conditions hold can we then go to Part Two of the Central Limit Theorem. See, Part Two of the Central Limit Theorem is the results about the sampling distribution. Part Two is literally going to be statements about these sampling distributions we developed. And here we go, guys, this is the big one. The sampling distribution: if you were to take all of those P-hats that you were to gather, if you did 100,000 trials, if you were to take all those P-hats and put them into a histogram, you would see the shape is normal. Part Two, the results of the Central Limit Theorem, are going to emphasize that the sampling distribution is normal in shape. Why is that so important? It's because throughout Chapter 3 and Chapter 6, we discussed normal distributions so much. We talked about normal and symmetric distributions in 3.1 and 3.2, and then it came up again in Section 6.2, where in particular we said that if you had a normal graph, a normal set of data, we could find probability by using normalcdf. And that is why knowing your shape is normal is so important, because it opens the door to answering questions about probability. We can then find probability by using our friend normalcdf, identifying the lower bound, the upper bound, the mean, and the standard deviation. That is the reason why that first result of the Central Limit Theorem is so powerful: if the shape is normal, it opens up the door to all the previous chapters and allows us to use what we learned. But the rub is, what are the mean and standard deviation here?
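You can see that normal shape for yourself with a quick simulation. This is a sketch with numbers I picked for illustration (a true proportion of 0.6, samples of size 200, 10,000 trials): draw many samples, collect the P-hats, and check a signature of a normal histogram, namely that about 68% of the values land within one standard deviation of the center.

```python
import random
import statistics

random.seed(1)  # fixed seed so the demo is repeatable

def sample_p_hat(p, n):
    """Draw one random sample of size n from a population where the true
    proportion of successes is p, and return the sample proportion p-hat."""
    return sum(random.random() < p for _ in range(n)) / n

# Hypothetical setup: true proportion 0.6, sample size 200.
p, n = 0.6, 200

# Repeat the sampling 10,000 times and collect all the p-hats.
p_hats = [sample_p_hat(p, n) for _ in range(10_000)]

# A histogram of p_hats would look normal. One quick numeric check:
# a normal shape puts roughly 68% of values within 1 SD of the center.
center = statistics.mean(p_hats)
spread = statistics.stdev(p_hats)
within_one_sd = sum(abs(ph - center) <= spread for ph in p_hats) / len(p_hats)
```

Running this, `center` lands very close to the true proportion 0.6, and `within_one_sd` lands near 0.68, just as a normal shape predicts.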
And that is where those formulas we learned are going to come in handy, because the center of the sampling distribution will be the population proportion. Literally, that formula we learned before the break will become the value we use for the mean in normalcdf, and the spread of my sampling distribution will be that standard error formula that we worked with. That number you got for standard error will be the standard deviation value in normalcdf, guys. Ultimately, the results of the Central Limit Theorem are so powerful because they open the door for us to answer probability questions, which is honestly the majority of what we want to know in statistics. The majority of the time in statistics, we're asking the question, what is the chance of something happening? What's the chance it's going to rain? What's the chance the Niners are going to win the Super Bowl? What is the chance that my dog is going to start singing? The idea of statistics is to answer probability questions, and now the Central Limit Theorem has brought together everything we have learned and is allowing us to answer those questions. And so, I'm going to zoom out. I want you to understand that both parts of the Central Limit Theorem are important. See, Part One is all about making sure we made a good sample by making sure these three conditions hold, because if those three conditions hold, if this is in fact a good sample, what that allows me to do is get to the results of the Central Limit Theorem. It allows me then to understand my population and answer probability questions about my population by utilizing these formulas for center and spread.
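Putting the pieces together, here's a sketch of a probability calculation the way the lecture describes it: a small normalcdf helper (matching the calculator's lower, upper, mean, standard deviation inputs), with the population proportion as the mean and the standard error as the standard deviation. The specific numbers (p = 0.6, n = 200, bounds 0.55 to 0.65) are my own illustration, not from the lecture.

```python
from math import erf, sqrt

def normalcdf(lower, upper, mean, sd):
    """Area under a normal curve between lower and upper, mirroring the
    calculator's normalcdf(lower, upper, mean, sd)."""
    phi = lambda x: 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))
    return phi(upper) - phi(lower)

# Hypothetical setup: true population proportion 0.6, one good sample of 200.
p, n = 0.6, 200

# Center of the sampling distribution: the population proportion.
mean = p
# Spread of the sampling distribution: the standard error formula.
se = sqrt(p * (1 - p) / n)

# Chance that one good sample gives a p-hat between 0.55 and 0.65:
prob = normalcdf(0.55, 0.65, mean, se)
```

With these numbers, `se` comes out to about 0.035 and `prob` to about 0.85, so a good sample has roughly an 85% chance of landing within 5 percentage points of the true proportion.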