Transcript for:
Sampling Distribution Concepts

In addition to talking about center, we always know center and spread go together. Let's go back and remember that when it came to categorical data, we would calculate the spread using the standard error formula. The standard error formula for a sample proportion was ultimately this huge formula: the square root of p * (1 - p) divided by n. And when it comes to comparing categorical and numerical data, really the only thing that's similar about spread is that concept of standard error. Only the concept of calculating standard error is going to be the similarity when thinking about the spread of the sampling distribution for proportions and means. Why? Why is standard error ultimately going to be the same idea? Well, again, let's go back and remember: standard error was ultimately a way for us to understand how precise my sample was. And remember, the concept of precision came from how large your sample was. That effect of sample size doesn't depend on whether you are asking about a categorical variable like "do you love kittens" or a numerical variable like how fast people run a mile. So once again, the result of having a large sample leading to more precision is going to apply here when studying sample means. And so that measurement of standard error to calculate precision is going to be used again here in chapter nine, because when we think about precision graphically, we can see in these sampling distributions that as the sample size gets larger and larger, the normal curves become more and more narrow, meaning that as the sample size got bigger and bigger, we had more precision.
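To make that narrowing concrete, here is a minimal Python sketch of the standard error formula for a sample proportion. The function name `se_proportion` and the value p = 0.6 for the kitten question are made up for illustration; they are not from the lecture.

```python
import math

def se_proportion(p: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical value: suppose 60% of the population loves kittens.
p = 0.6

# Same proportion, larger and larger samples.
for n in (25, 100, 400):
    print(f"n = {n:>3}: standard error = {se_proportion(p, n):.4f}")
```

Each time the sample size is quadrupled, the standard error is cut in half, which matches the curves getting more and more narrow.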
And so ultimately what I'm trying to emphasize in these last two blocks of the page is that it almost didn't matter if we were in the world of proportions or the world of means. Once again, when your sample size is large, we will have better precision. And so the way we calculate precision is once again standard error. Once again, we will calculate standard error when we're looking at situations where the sample is large. So what are the similarities and differences between what we saw in chapter seven and what we're seeing here now in chapter nine? Well, one similarity is that spread again is only going to make sense if my sample is large. The idea that the standard error will get smaller and smaller as my sample size gets bigger and bigger is still going to be the same. But the formula for standard error for sample proportions is going to be a vastly different formula than the one we're going to use for sample means. The formula for standard error when looking at sample means is the standard deviation of the population divided by the square root of the sample size. I'll say that one more time: when it comes to the standard error formula for sample means, I want you to see that this formula in green is vastly different than the standard error formula we used back in chapter seven. And that makes sense, because the standard error formula for proportions used p, a proportion. Proportions don't exist in numerical data. Numerical data is all about means and standard deviations. And so because of that, notice how this formula is using sigma. What is sigma? Sigma is the population standard deviation.
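As a rough sketch of the formula for sample means, the Python below divides sigma by the square root of n. The name `se_mean` and the value sigma = 2 minutes for mile times are assumptions chosen only for illustration.

```python
import math

def se_mean(sigma: float, n: int) -> float:
    """Standard error of a sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Hypothetical value: mile times with a population standard deviation of 2 minutes.
sigma = 2.0

# n sits in the bottom of the fraction, so a larger n gives a smaller error.
for n in (16, 64, 256):
    print(f"n = {n:>3}: standard error = {se_mean(sigma, n):.4f} minutes")
```

Notice that no proportion p appears anywhere: the only inputs are the population standard deviation and the sample size.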
And so what I want you to see is that the sampling distribution for numerical data is really utilizing the population mean and the population standard deviation. That makes sense, because numerical data always had to do with means and standard deviations. And what I want you guys to notice is that the sample size n is in the bottom of each of these standard error formulas, and that makes sense. We know that as the sample size gets larger, the bottom of the fraction gets larger, so the overall fraction gets smaller. The error gets smaller, and we get better precision. And so notice how we still get the same result: as the sample size increases, the standard error will get smaller and precision will get better. And so the whole point of me going over these with you is for you guys to see the similarities and the differences of sampling distributions when looking at means and proportions. The similarities are what I emphasized in the more, frankly, muted colors, in the yellow and in the gray. Regardless of whether we're looking at categorical or numerical data, sampling distributions will have a center and a spread, where the center will always be equal to the population parameter when the sample is random. Again, what's going to be the same is that the spread will always be the standard error, where the sample size being large enough will make the error smaller. But outside of that there are some major differences, and that's what I highlighted in orange. In chapter seven we used proportions, and we had a different standard error formula. Now in chapter nine, with numerical data, we're going to be using means and standard deviations. And so what this really means is that we need to make sure we know our symbols. Ultimately, we're shifting gears now.
What we did in chapters seven and eight was look at proportions. But now in chapter nine we are studying numerical data. What we are going to be doing here in chapter nine is using the sample mean x-bar to estimate the population mean, and we're going to be using the sample standard deviation s to help us understand what the standard deviation of the overall sampling distribution looks like. So I highly encourage you guys to write down these symbols on, like, a Post-it note and throw it up on your wall, because we are going to use these symbols a lot over chapter nine. Now, I just explained to you guys an incredible amount of formulas, concepts, and theories. So what are the big things I want you to take away? Ultimately, the big things I want you to take away are these two things. First, if your sample is collected randomly, we can average all the sample means and it'll give us the population mean. That's the first takeaway. What's the second takeaway? The second takeaway is that we can find standard error using the standard error formula: sigma (the population standard deviation) divided by the square root of the sample size, and that if your sample is large enough it will make your error go down.
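Both takeaways can be checked with a small simulation. The sketch below is only an illustration under assumed values: a hypothetical normal population of mile times with mean 9 minutes and standard deviation 2 minutes, and a helper named `simulate_sample_means`, none of which come from the lecture.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population of mile times in minutes (assumed values).
MU = 9.0     # population mean
SIGMA = 2.0  # population standard deviation

def simulate_sample_means(n, reps=2000):
    """Draw `reps` random samples of size n and return each sample's mean."""
    return [
        statistics.fmean([random.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(reps)
    ]

results = {}
for n in (10, 40, 160):
    means = simulate_sample_means(n)
    center = statistics.fmean(means)   # average of all the sample means
    spread = statistics.stdev(means)   # observed spread of the sample means
    formula = SIGMA / n ** 0.5         # standard error formula: sigma / sqrt(n)
    results[n] = (center, spread, formula)
    print(f"n = {n:>3}: center = {center:.3f}, "
          f"spread = {spread:.3f}, sigma/sqrt(n) = {formula:.3f}")
```

In a run like this, the center should land close to the population mean of 9 for every sample size, and the observed spread should track sigma divided by the square root of n, shrinking as n grows.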