Example two: The ethnic breakdown in a certain community is known to be 60% White, 20% Hispanic, 10% Asian-American, and 10% African American. Suppose 200 people are called for jury duty, and the ethnic breakdown of the potential jurors is shown in the chart below. So here's our observed data, right? This is from potential jurors, actual people that showed up. Does this ethnic distribution of the jury pool match that of the community? Use a significance level of 0.05.
All right, so clearly we've got ourselves a categorical variable. The variable is ethnicity, and we've got four categories here. Before I get too far, let's do a degrees of freedom calculation real quick. We said four categories, so degrees of freedom is 4 minus 1, which gives us 3 degrees of freedom.
Expected: To figure out what all of our expected values are, we know the ethnic breakdown of this community, and we can use that to figure out what we would have expected the jury pool to look like. So, n = 200. We need to figure out what is 60% of 200, 20% of 200, and then 10% of 200. You can't see me, but off-screen I'm going to my graphing calculator and doing these calculations. 60% of 200 is 120. 20% of 200 is 40. So then 10% of 200 must be 20, and 20 again for the other 10% group. If I add up all these numbers, yep, I still get 200. The expected counts check out.
All right, we've done a lot of pre-work. Let's go ahead and jump into our hypotheses. So what's our null hypothesis? The null hypothesis is the safe assumption. The safe assumption is that this jury pool is coming from the community, so it should have the same ethnic breakdown. So, H-naught: the ethnic breakdown of the potential jurors matches the community. Or, rather than writing the whole sentence, you could just write: Observed = Expected. Alternative hypothesis: they're not the same. They don't match. In general, I do find it actually really nice to write out the sentence for H-a.
I do kind of recommend that, because when we do our interpretation, we have to rewrite that sentence anyway. So if we've already taken one stab at it, it'll make that interpretation later a little bit easier. But if you don't want to do that, you could just write down: H-naught is O = E, and H-a is O ≠ E.
Prepare: We need our alpha, our significance level. Fortunately, it was given to us: a significance level of 0.05, so there's our alpha, 5%.
Conditions: Random? Yeah, we assume it is. Independent measurements? Yeah. Large sample? We need all of those expected values to be at least 5. The smallest E was 20, which is ≥ 5, so we're good. This is the one condition you will need to show on a test or quiz.
Compute: If you're lucky and you have a TI-84, you just go to the chi-square goodness of fit test. Maybe first put all the observed data into list one and all the expected data into list two, so you don't have to exit out after the fact. If you're unlucky, though, and you have a TI-83, you have to do this by hand, which takes a little bit more work. Let me go ahead and walk you through those steps on my graphing calculator. It's still good to know what the test name is, so I'd still write down: chi-square goodness of fit test.
If you have a TI-83, let me go ahead and show you. Okay, so first let's put all our data into our lists: Stat → Enter. Let's clear out the lists that I have here: Arrow up, Clear → Enter → Over → Up → Clear → Enter. List one is my observed data. So starting in the first cell: 137 → Enter, 23 → Enter, 18 → Enter, 22. List two is what we expected, based off of the breakdown that we were given. We should have had: 120 White, 40 Hispanic, 20 Asian-American, 20 African American.
All right, if you're using a TI-83, we need to use list three to go ahead and calculate all those (O - E) squared divided by E values. So we come over here, arrow up, hit enter.
Down here, we put in our formula: open parenthesis, observed minus expected, which is my list one minus list two, close parenthesis, squared, divided by expected, which is list two, and then Enter. So the formula is (L1 - L2)² / L2, and it'll do this calculation for each of those rows. Then I want to add these values up. You might be able to just do this real quick on a calculator, but I'm going to go ahead and go to: Stat → Calc → 1-Var Stats → List 3 → Calculate, and there's the sum of my list three right there. So there's my chi-square.
All right, if you have a TI-83, now we have to look up the p-value manually. It's not going to be done for us. So we would come over here and go to: Second → Vars → scroll to option 8 (or press 8): chi-square CDF. For the lower bound, I type in the chi-square value that I just got: 10.033. Upper is always going to be 10^9. This is always a right-skewed distribution, so we're always shading into the right tail. And our degrees of freedom we calculated as 3. Paste → Enter → there's my p-value: 0.018.
All right, because this is a TI-84, I can also go the other way, and it's much faster. So let's just see that quick: Stat → Tests → chi-square goodness of fit. Yes, my observed is in list one. Yes, my expected is in list two. Degrees of freedom is 3 in this case. Draw. There's my right-skewed graph. I see the p-value shaded in the tail, and I see that yeah, that's the same chi-square and p-value that I just found. It took about a quarter of the time, so it was much faster. So hopefully you have a TI-84, or you can maybe borrow one to do the 10.2.2 homework. For the test, I promise I won't make you do a chi-square goodness of fit calculation, but you will be tested on knowing when to use a chi-square goodness of fit test.
All right, going back to my notes. So we've done our work. We have our test statistic, and we have our p-value. Now we need to interpret. So we compare our p-value to our significance level like we always have. We see in this case that the p-value is smaller. So do we reject or fail to reject?
Hopefully you agree with me that it's reject. We reject the null hypothesis: 0.018 is less than 0.05, so we reject. That means that yeah, we actually have enough evidence that this jury pool is not matching the community, and that's a concern, right? That's something that could possibly result in, like, a mistrial or something. So we have enough evidence that the ethnic breakdowns of the potential jurors and the community actually don't match. All right, that's it for 10.2.
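If you'd like to double-check the calculator work away from the TI, here is a minimal Python sketch of the same goodness of fit calculation. It is not part of the lesson: the variable names and the helper `chi_square_right_tail_df3` are illustrative assumptions, and the closed-form tail formula inside that helper only applies when there are exactly 3 degrees of freedom, which happens to be our case.

```python
import math

# Community proportions and observed jury-pool counts from the example.
categories = ["White", "Hispanic", "Asian-American", "African American"]
proportions = [0.60, 0.20, 0.10, 0.10]
observed = [137, 23, 18, 22]
alpha = 0.05

n = sum(observed)                          # total potential jurors
expected = [p * n for p in proportions]    # expected count per category
df = len(categories) - 1                   # number of categories minus 1

# Large-sample condition: every expected count must be at least 5.
large_sample_ok = min(expected) >= 5

# Test statistic: sum of (O - E)^2 / E over the categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_right_tail_df3(x):
    """Right-tail area of a chi-square distribution with exactly 3 degrees
    of freedom (hypothetical helper; this closed form is valid only for
    df = 3)."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


p_value = chi_square_right_tail_df3(chi_square)
reject_null = p_value < alpha

print(f"chi-square = {chi_square:.3f}, df = {df}, p-value = {p_value:.3f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

Running this should agree with the calculator: a chi-square of about 10.033 and a p-value of about 0.018, which is below 0.05, so we reject. For a different number of categories you would need a general chi-square tail function instead of the df = 3 shortcut used here.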