Transcript for:
Statistical Inference Tools

In Chapter Seven, we talked about confidence intervals, and now here in Chapter Eight, we've talked about hypothesis tests. And while they are clearly different topics, which is why I covered them in different chapters, I want you guys to remember that they both fall under the umbrella of statistical inference. Confidence intervals and hypothesis tests are both statistical inferences. And again, what's the point of a statistical inference? It's to understand my population. Remember, that was the whole point. That's what we learned at the beginning of Chapter Seven: we needed to do these statistical inferences to understand populations that are too large to study in their entirety. We did statistical inferences so we could answer important questions like "What percent of Americans love kittens?" The idea of the statistical inference is that it allowed us to learn about a population that big by looking at a sample that was reasonable to collect. And when it came to doing this statistical inference, we had one of two ways of running it: a confidence interval or a hypothesis test.
Now we've done confidence intervals, and we've now done hypothesis tests, and I want to talk about what makes them different. Going back to a confidence interval, the idea of a confidence interval is that it helps us find the value of that percent we want to find. So if my question is "What percent of Americans love kittens?", well, a confidence interval like the interval from 0.6 to 0.7 answers that question by saying that between 60% and 70% of Americans love kittens. The idea of the confidence interval is that it gives us a range in which we're confident that percentage lies. A confidence interval is actually interested in the value of this percent we want to find, and while we're not going to give an exact number, we'll give a range in which that number can fall. Whereas hypothesis testing didn't care.
It didn't care what that percentage was, but rather it wanted us to compare it to a status quo. So for instance, say my question went something like this: in 2000, I surveyed all Americans. I was so determined to know all Americans' results that I asked every single American, "Do you love kittens?" And we found 62% of Americans love kittens. That's my status quo. And ultimately what hypothesis testing did was just ask the question, "Does that parameter still equal that 62%, or is it something different? Does that percentage still equal 62%, or is that percentage greater than 62%?" Hypothesis testing actually didn't care what the percentage was. We just wanted to compare it to a status quo. Notice they are different questions. A confidence interval wanted a particular value, whereas hypothesis testing just looked for an inequality, a comparison. And yet both still do what we want in a statistical inference: they help me understand my population, whether that's the range of percentages of Americans who love kittens or just whether the percentage is greater than 62%. It still helps me understand my population.
And so it's with that, I want you guys to look at Example Five. Take a moment, read each question in A and B, and determine which method you would use. Would you use a confidence interval or a hypothesis test? In A, which method is more appropriate if you want to know the population percent of people who prefer Pepsi? Yeah, it's going to be a confidence interval, because I want to know the value. If you want to know the exact value, well, frankly, that's not possible unless you actually survey everyone. So the next best thing we can do is find a range of values, like 60% to 70%, and that leads us to confidence intervals. Versus in B, which method is more appropriate if you want to know if more than 50% prefer Pepsi? Yeah, a hypothesis test, because in this case we don't care what the percentage is. We're just interested in the comparison, the comparison of "more than."
The idea is that when you have a phrase like "more than," which gives us an inequality, some sort of comparison, that is when we use hypothesis testing.
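To see the two tools side by side, here is a minimal Python sketch of both inferences run on the same sample. The survey numbers (650 out of 1000 people saying they love kittens) are made up purely for illustration, the function names are hypothetical, and the calculations assume the usual normal approximation for a proportion from a large random sample. The first function answers "what is the percent?" with a range; the second answers "is the percent greater than the 62% status quo?" with a P-value.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Range of plausible values for the population proportion."""
    p_hat = successes / n
    # Critical value for the chosen confidence level (about 1.96 for 95%).
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def one_sided_proportion_test(successes, n, p0):
    """Test H0: p = p0 against Ha: p > p0; returns (z, P-value)."""
    p_hat = successes / n
    # The standard error uses the status quo value p0, not the sample value.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical sample: 650 of 1000 people surveyed say they love kittens.
low, high = proportion_confidence_interval(650, 1000)
z, p_value = one_sided_proportion_test(650, 1000, p0=0.62)

print(f"95% confidence interval: {low:.3f} to {high:.3f}")
print(f"Test of 'greater than 62%': z = {z:.2f}, P-value = {p_value:.4f}")
```

Notice that the confidence interval reports a range for the percent itself, while the hypothesis test only reports how strong the evidence is that the percent is greater than the status quo.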