Transcript for:
Hypothesis Testing for Two Proportions

Let's dive back into our apnea question and try verifying those conditions. As always, I'm going to provide scaffolding in purple so that we understand, and frankly, you guys have a template in your notes on how to run these conditions. So again, let's go back to our apnea question. First things first, we need to identify the significance level. But oh man, nowhere in the prompt was it given. So can you remind me: the significance level works exactly the same as before, so if it's not given, what did we say we would use for it? Yeah, 0.05.
After that, we check our conditions, and in purple I made a point to emphasize what the three conditions are again. The first condition is checking that both samples are random, and in this case they are. It's clearly given: random infants given therapy, random infants in the placebo group. And that's really all you need, all right? You just need to see, and say, that both samples are random.
Honestly, where it gets weird is step two. Step two is where things get very different, and the main reason is that we need to do a pre-step. The pre-step is probably the most different part of all of section 8.4: you need to calculate the pooled sample proportion, p hat = (X1 + X2) / (N1 + N2). So again, let's understand what the X's represent. Remember, in general, X represents the number of successes. So what we need to identify is: what is considered the success here? Weird as it is, the successes are those children who suffered death or disability. Identifying that as my success then helps me identify what X1 and X2 are. In this case, from my first group, 377 suffered death or disability. From my second group, 431 suffered death or disability.
So in this case, my prompt is giving me my X1 and X2. Adding those two together, 377 plus 431, tells me that, regardless of the therapy, 808 infants suffered death or disability. After that, we need to divide by the sum of my sample sizes. Can you guys help me out looking at the prompt? What two numbers are my sample sizes? Yeah, 937 got the therapy, 932 got the placebo. Adding those two values together, N1 of 937 plus N2 of 932, tells me that, regardless of therapy, 1,869 infants were studied. And the idea here is that you then divide 808 by 1,869 to give me 0.43, my pooled sample proportion.
And it's this value of 0.43 that we are then going to multiply by each of my sample sizes. So I'll take my sample size N1 of 937 and multiply it by the pooled sample proportion to give me the number of successes, about 403, which I just need to check is greater than 10. And then it's that number I will use to calculate how many failures there are in group one: group one's 937 minus those 403 gives 534 in the failure camp according to this pooled sample proportion, and 534 is greater than 10.
I want to emphasize what's different here: as we calculate the number of successes and the number of failures, we do it with respect to the pooled sample proportion. Why? Because the idea of the pooled sample proportion is that, regardless of what treatment the child got, it tells us how many are going to be successes and how many are not. So the big difference here is that we're calculating those successes using the pooled p hat. So why don't you guys find the number of successes and the number of failures, using the pooled p hat, for my second group?
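As a check on the arithmetic above, here is a minimal Python sketch of the pooled sample proportion and the large-sample counts for both groups. The variable and function names are illustrative and not from the lecture. One assumption to flag: the lecture rounds p hat to 0.43 before multiplying (which is how it gets about 403 for group one), while this sketch keeps the unrounded value, so the counts it prints differ by a couple of infants. The conclusion about the condition is the same either way.

```python
# Counts quoted in the transcript: x = deaths or disabilities, n = group size.
x1, n1 = 377, 937   # therapy group
x2, n2 = 431, 932   # placebo group

# Pooled sample proportion: all "successes" over all infants studied.
p_pooled = (x1 + x2) / (n1 + n2)


def expected_counts(n, p):
    """Return (successes, failures) for a group of size n under proportion p."""
    successes = n * p
    return successes, n - successes


group1 = expected_counts(n1, p_pooled)
group2 = expected_counts(n2, p_pooled)

# Large-sample condition as stated in the lecture: every count greater than 10.
large_sample_ok = all(count > 10 for count in group1 + group2)

print(f"pooled p hat: {p_pooled:.4f}")
print(f"group 1 successes, failures: {group1[0]:.1f}, {group1[1]:.1f}")
print(f"group 2 successes, failures: {group2[0]:.1f}, {group2[1]:.1f}")
print(f"large-sample condition satisfied: {large_sample_ok}")
```

Keeping p hat unrounded until the end is a design choice in this sketch, not something the lecture prescribes; replacing `p_pooled` with `0.43` reproduces the rounded figures worked by hand.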
And since all the successes and all the failures are greater than 10, it looks like, yeah, we do have the large sample condition satisfied. Remember, there is one more condition we need to check: the condition of independence. And there are two sub-questions when it comes to checking independence: independence of each other versus independence within.
So what does it mean to check independence of each other? It means you first need to go back and consider who my two groups are. My first group is the babies who received therapy. My second group is those who received the placebo. Asking whether these two samples are independent of each other is asking: if we look at one baby who received therapy and another baby who received the placebo, is that therapy baby going to affect that placebo baby, or will that placebo baby affect that therapy baby? Because the idea here is that if they don't affect each other, these two samples are independent. So my question for you is, are these two samples independent? Is the group of babies in therapy separate and apart from the babies who got the placebo? Absolutely. And again, it's because each infant was given either the therapy or the placebo; they were only given one. So really, the first kind of independence is comparing your two groups, literally the groups we identified earlier: therapy babies versus placebo babies.
So then what does it mean to be independent within? Independent within is actually looking at who we are studying. In this case, the individuals, whether they're in the therapy or the placebo group, are babies. And what about these individuals are we asking? Well, let's go back. What we're asking about these babies is: are they breathing shallowly, or are they stopping breathing? That is the who and the what we are studying. The who is babies; the what is whether they are breathing shallowly.
And so when it comes to checking the second form of independence, you're asking the who question, who are we studying (infants), and the what question, what are they doing (we are looking at their breathing). And we are asking whether they are independent of each other. You're asking the question: do babies breathe independently of each other? That is what the second question, are the individuals within each sample independent, is getting at. For each individual we studied, which were the babies, is the act of breathing independent from one to the next? Do infants in general breathe independently of each other? Yeah, absolutely. Because we aren't saying explicitly that we're looking at conjoined twins who have a shared lung. No, the assumption is that when you walk into a maternity ward, the babies are breathing independently of each other. So yes, there's independence within, because infants breathe independently of each other.
I want to emphasize that you only have three conditions to check: randomness, large sample, and independence. And just like we saw before, if all three conditions hold, that's the green light to keep moving forward, because you can now do your calculations.