Transcript for:
Central Limit Theorem Overview

So, again, before we can even understand my sampling distribution by looking at part two, the results of the central limit theorem, you always need to do step one of the central limit theorem, which is to check the conditions for the central limit theorem. Now, again, what are we looking at? We're told that 30 women were randomly selected. So already we see that we have random selection; we're good there. And again, who are we studying? Ultimately, we are studying women. And so when it comes to checking my large population, I really just need to check that the population is at least 10 times the sample size. My sample size is 30, so 10 times that is 300. So in this case, we just need to check that there are definitely over 300 women. And are there definitely over 300 women? Yeah, absolutely. So I kind of breezed through these first and third conditions really fast because this is exactly what we saw back in chapters seven and eight. And so again, the only condition that gets complicated, the only condition that's really different, is going to be condition two, the normal condition. Now again, how do we check condition two? Well, for starters, it's an "OR" between these, meaning only one needs to hold. The first one is asking about the shape of the population, and literally, you just look to see if it's given, and if it's given, is it normal? And we were told that the population had a shape that was skewed right. But when it comes to the normality condition, what we want is for the population to have a shape that is normal, which sadly it is not. We all know skewed right is not normal. But remember, that word "OR" is emphasizing that that's okay. It's okay that the first part didn't work, because as long as the second one works, we're still good.
And so remember, the second part is taking that sample size of 30 and asking if it's greater than a specific number. Can you guys look back on the previous pages? Yeah, we need it to be greater than 25. And in this case, is a sample size of 30 greater than 25? Yeah, absolutely, smiley face. And so the idea about the normal condition is that as long as you have one smiley face, as long as you have one that holds, that is enough for checking the second condition. What's so fun about these conditions is that they're really easy. It's all just visually checking: Was your data collected randomly? Yep. Was your population either normal or your sample bigger than 25? Yep. Is your population at least 10 times the sample size? Yep. All these conditions are really straightforward to check, and once all three of these conditions hold, we can then go to step two, the results of the central limit theorem. Can you guys remind me, when it comes to the central limit theorem, what is always the shape of the sampling distribution? Yeah, it is always, always, always normal. And while that seems like a bit of a benign result, I need to emphasize to you guys why this is so important. This is so important because this normal shape means that we can answer probability questions using normalcdf. Remember, we talked about that in chapter six and again in chapter seven: when the shape is normal, it allows us to use normalcdf to find the area under the curve, which is the probability. And again, why is that so powerful? Well, because being able to find probability with normalcdf is what allows us to do things like find confidence intervals and run hypothesis tests. So what I want you to understand is that this shape being normal is the first domino that we need to be able to do much more complex statistical inferences like confidence intervals and hypothesis testing, so that one little word, normal, is actually doing so much for us.
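The three condition checks from step one can be sketched in a few lines of Python. This is only an illustrative sketch, not something from the lecture: the function name `check_clt_conditions`, its parameter names, and the population size of 1,000,000 women are assumptions made for the example. The cutoff of 25 and the "10 times the sample size" rule are the ones stated above.

```python
def check_clt_conditions(randomly_selected, population_shape, sample_size,
                         population_size, min_sample_size=25):
    """Check the three conditions described in the lecture.

    All names here are hypothetical; only the rules themselves
    (random sample, normal OR n > 25, population >= 10 * n) come
    from the transcript.
    """
    # Condition 1: the sample was collected randomly.
    random_ok = bool(randomly_selected)

    # Condition 2 is an OR: the population is normal,
    # OR the sample size is greater than 25.
    normal_ok = (population_shape == "normal") or (sample_size > min_sample_size)

    # Condition 3: the population is at least 10 times the sample size.
    large_population_ok = population_size >= 10 * sample_size

    return {
        "random": random_ok,
        "normal": normal_ok,
        "large_population": large_population_ok,
        "all_hold": random_ok and normal_ok and large_population_ok,
    }


# The pulse-rate example: 30 randomly selected women, population skewed right.
# The population size is an assumed placeholder; the lecture only says
# there are "definitely over 300 women".
result = check_clt_conditions(
    randomly_selected=True,
    population_shape="skewed right",
    sample_size=30,
    population_size=1_000_000,
)
print(result)
```

In this example the normal condition passes through the second half of the "OR": the population is skewed right, but 30 is greater than 25.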
After that, we want to be able to calculate the center and the spread. Now again, remember that when these Central Limit Theorem conditions hold, particularly when the randomness condition holds, that is what gives us the center of my sampling distribution. Because the randomness of my sample tells me that the average of all the sample means should equal my population mean, and we know what the population mean is. Here we said that the population mean was 74 BPM. And so again, that simple observation of checking if your sample was collected randomly leads to such a powerful result: this sampling distribution has the same center as my population, and we can calculate its spread by using the standard error formula. Again, the standard error formula is the population standard deviation divided by the square root of the sample size. And what's great about this spread formula is that literally the blue standard deviation and the orange sample size are going to be given to you in the prompt. Notice that the blue standard deviation of 13 BPM is given. Notice that in the prompt, the sample size (n) of 30 women is given. And so what I want you guys to see here is that when it comes to calculating standard error, honestly, the hardest part is just remembering what the formula is. And here's my hint: don't memorize it, put it on your note sheet. Why did you guys calculate that for me? What is the point of this sampling distribution? The point of the sampling distribution is literally to use the information from the sampling distribution to make a statistical inference. Again, the idea of a statistical inference is to make a statement about what we expect to happen, what we predict to happen, for my entire population. Again, women are my population, and ultimately what we're going to do in this statistical inference is use that center that we found from the Central Limit Theorem.
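The center and spread calculation described above can be written out as a short Python sketch. The function name `sampling_distribution` is an assumption made for illustration; the numbers (population mean 74 BPM, population standard deviation 13 BPM, sample size 30) are the ones given in the prompt.

```python
import math


def sampling_distribution(population_mean, population_sd, sample_size):
    """Return (center, standard_error) of the sampling distribution of the mean."""
    # Center: the mean of the sample means equals the population mean.
    center = population_mean
    # Spread: population standard deviation divided by the square root of n.
    standard_error = population_sd / math.sqrt(sample_size)
    return center, standard_error


center, spread = sampling_distribution(population_mean=74,
                                       population_sd=13,
                                       sample_size=30)
print(f"center = {center} BPM, standard error = {spread:.2f} BPM")
# prints: center = 74 BPM, standard error = 2.37 BPM
```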
So we expect that the mean pulse rate for my population of women will be 74 BPM. The idea here is that this sampling distribution is allowing us to write a prediction, an expectation for my population, and that center is that prediction. But it's not infallible; it's not absolutely and only correct at 74 BPM. We all know every woman does not have exactly 74 BPM. And so what this statistical inference is doing is also giving us a way to measure that error. It's giving us a way to measure the uncertainty, and that uncertainty is ultimately that spread we calculated in the results of the Central Limit Theorem. That spread of the sampling distribution will be that give or take, that uncertainty of 2.37 BPM. I know it's taking a little bit of brain stretching here. Again, I'm really trying to emphasize in yellow to help us identify what is the same, what things are kind of similar, and to color code how things flow through the entire problem. Like, I want you guys to see that at the very beginning of the problem we were given that population mean, and how that was ultimately used as the center when it came to the Central Limit Theorem results, which was ultimately used in my statistical inference sentence.
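To tie the "give or take" back to the normal shape, here is a rough Python stand-in for the calculator's normalcdf, using the same argument order (lower, upper, mean, standard deviation). This is a sketch under assumptions, not part of the lecture: the Python function is hand-written with `math.erf`, and the question it answers (how likely is a sample mean to land within one standard error of 74 BPM?) is chosen here only as an illustration.

```python
import math


def normalcdf(lower, upper, mean, sd):
    """Area under a normal curve between lower and upper.

    Hand-written illustration of what the calculator function does;
    it is not the calculator's own implementation.
    """
    def cdf(x):
        # Cumulative probability of a normal distribution at x.
        return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

    return cdf(upper) - cdf(lower)


center = 74                    # population mean in BPM, from the prompt
spread = 13 / math.sqrt(30)    # standard error, about 2.37 BPM

# Probability that a sample mean falls within one standard error of the center.
prob = normalcdf(center - spread, center + spread, center, spread)
print(round(prob, 4))
# prints: 0.6827
```

So under the normal sampling distribution, roughly 68% of sample means of 30 women would land within 2.37 BPM of 74 BPM.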