Module 19 wrap-up: hypothesis tests for a population proportion.

Let's summarize. In this section we looked at the four steps of a hypothesis test as they relate to a claim about a population proportion.

Step one: determine the hypotheses. The hypotheses are claims about the population proportion, p. The null hypothesis is a hypothesis that the proportion equals a specific value, p-naught. The alternative hypothesis is a competing claim that the parameter is less than, greater than, or not equal to p-naught.

Step two: collect the data. Since the hypothesis test is based on probability, random selection, or random assignment in an experiment, is essential in data production. If you are not using a simulation, then verify that the normal curve can represent the distribution of sample proportions: n times p-naught and n times one minus p-naught must both be at least 10. The p-naught in both of those products is the value from the null hypothesis.
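As a rough illustration of the check in step two, here is a minimal Python sketch. The function name `normal_model_ok` and the sample sizes are made up for this example; only the rule itself (both n times p-naught and n times one minus p-naught at least 10) comes from the summary above.

```python
def normal_model_ok(n: int, p0: float, minimum: float = 10.0) -> bool:
    """Return True when both expected counts under the null are at least `minimum`.

    n  : sample size
    p0 : proportion stated in the null hypothesis (p-naught)
    """
    expected_successes = n * p0        # n times p-naught
    expected_failures = n * (1 - p0)   # n times (1 minus p-naught)
    return expected_successes >= minimum and expected_failures >= minimum


# Hypothetical examples: the expected counts are 20 and 180, then 5 and 45.
print(normal_model_ok(200, 0.10))
print(normal_model_ok(50, 0.10))
```

If the check fails, the normal curve is not a good model for the sample proportions, and a simulation is the safer route.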
Step three: assess the evidence. Conduct the simulation or determine the test statistic, which is the z-score for the sample proportion. The formula is z = (p-hat - p-naught) / sqrt(p-naught × (1 - p-naught) / n). In the numerator you have p-hat minus p-naught, where p-naught is from the null hypothesis. That is divided by the standard error, which is also based on the p from the null: p-naught times one minus p-naught, divided by n, and then take the square root of that. If using the normal model, find the p-value using StatCrunch or the normal distribution calculator. If the alternative hypothesis is greater than, the p-value is the area to the right of the test statistic. If the alternative hypothesis is less than, the p-value is the area to the left of the test statistic. If the alternative hypothesis is not equal to, the p-value is double the tail area beyond the test statistic.

Step four: give the conclusion. A small p-value says the data is unlikely to occur if the null is true. If the p-value is less than or equal to the significance level, we reject the null hypothesis and accept the alternative hypothesis instead. If the p-value is greater than the significance level, we say we fail to reject the null hypothesis. We never, ever say we accept the null hypothesis; we just say that we don't have enough evidence to reject it. This is equivalent to saying we don't have enough evidence to support the alternative hypothesis. We write the conclusion in the context of the research question. Our conclusion is usually a statement about the alternative hypothesis (we accept the alternative hypothesis or fail to accept the alternative hypothesis) and should include the p-value.

Inference is based on probability, so there is always uncertainty. Although we may have strong evidence against it, the null hypothesis may still be true. If this is the case and we reject it, we have a Type I error. Similarly, even if we fail to reject the null hypothesis, it does not mean the
alternative hypothesis is false. If the alternative is actually true in this case, we have a Type II error. These errors are not the result of a mistake in conducting the hypothesis test. They occur because of random chance.

Other hypothesis testing notes. Remember that the p-value is the probability of seeing a sample proportion at least as extreme as the one observed from the data if the null hypothesis is true. The probability is about the random sample, not about the null or alternative hypothesis.

A larger sample size makes it more likely that we will reject the null hypothesis if the alternative is true. Another way of thinking about this is that increasing the sample size will decrease the likelihood of a Type II error. Recall that a Type II error is failing to reject the null hypothesis when the alternative is true. Increasing the sample size can also have the unintended effect of making the test sensitive to differences so small they don't matter. A statistically significant difference is one large enough that it is unlikely to be due to sampling variability alone. Even a difference so small that it is not important can be statistically significant if the sample size is big enough.

Finally, remember the phrase "garbage in, garbage out." If the data collection methods are poor, then the results of a hypothesis test are meaningless. No statistical methods can create useful information if our data comes from convenience or voluntary response samples. Additionally, the results of a hypothesis test apply only to the population from which the sample was chosen.
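To make steps three and four concrete, here is a minimal Python sketch of the z-test for one proportion, using only the standard library instead of StatCrunch. The function name `one_proportion_z_test`, the labels for the alternative, the default significance level of 0.05, and all of the counts are assumptions made up for illustration; none of them come from the module.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="not equal", alpha=0.05):
    """Return (z, p_value, decision) for a test of H0: p = p0.

    alternative is one of "greater", "less", or "not equal".
    alpha is the significance level (0.05 is only an assumed default).
    """
    p_hat = successes / n                     # sample proportion
    standard_error = sqrt(p0 * (1 - p0) / n)  # based on p-naught from the null
    z = (p_hat - p0) / standard_error         # test statistic

    std_normal = NormalDist()                 # mean 0, standard deviation 1
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)       # area to the right of z
    elif alternative == "less":
        p_value = std_normal.cdf(z)           # area to the left of z
    elif alternative == "not equal":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # double the tail area
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'not equal'")

    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    return z, p_value, decision


# Hypothetical data: 60 successes in 100 trials, null value 0.5.
# Here z is about 2.0 and the right-tail p-value is about 0.023.
print(one_proportion_z_test(60, 100, 0.5, alternative="greater"))

# Same sample proportion (0.52) at two made-up sample sizes, to show how
# a larger n can make a small difference statistically significant.
for successes, n in [(52, 100), (5200, 10000)]:
    print(n, one_proportion_z_test(successes, n, 0.5, alternative="not equal"))
```

In the last two lines the difference between 0.52 and 0.50 is the same size in both cases, but only the larger sample leads to rejecting the null, which is the sample-size caution from the notes above. The sketch assumes the normal-model condition has already been checked and that the data come from a random sample.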