Overview
This lecture introduces the p-value in hypothesis testing: its definition, its interpretation, and its relationship to sampling variability and the null hypothesis.
Review of Hypothesis Testing
- Hypothesis tests involve a null hypothesis (statement with equality) and an alternative hypothesis (statement without equality).
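- The test statistic measures how far the sample data deviate from the value claimed by the null hypothesis.
- Sampling variability means that a random sample usually differs somewhat from the population, and therefore from the value claimed by the null hypothesis, even when the null hypothesis is true.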
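The review points above can be illustrated with a small simulation. This is only a sketch under assumed values: the null claim H0: p = 0.5, the sample size of 100, and the helper names `sample_proportion` and `z_statistic` are hypothetical choices for illustration, not part of the lecture. The population here really does satisfy the null hypothesis, yet the sample proportions still miss 0.5 by varying amounts.

```python
import math
import random

def sample_proportion(p_true, n, rng):
    """Proportion of successes in one random sample of size n."""
    successes = sum(1 for _ in range(n) if rng.random() < p_true)
    return successes / n

def z_statistic(p_hat, p_null, n):
    """Standardized distance between the sample proportion and the null value."""
    standard_error = math.sqrt(p_null * (1 - p_null) / n)
    return (p_hat - p_null) / standard_error

rng = random.Random(0)  # fixed seed so the run is repeatable
p_null = 0.5            # assumed null claim, H0: p = 0.5
n = 100                 # assumed sample size

# The population matches the null exactly, but each sample lands somewhere different.
for _ in range(5):
    p_hat = sample_proportion(p_null, n, rng)
    print(f"p_hat = {p_hat:.2f}, z = {z_statistic(p_hat, p_null, n):+.2f}")
```

Each printed line is one random sample. The nonzero z values come from sampling variability alone, since the null hypothesis is true by construction in this sketch.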
Introduction to P-Value
- The p-value is a tool for deciding how to explain a disagreement between the sample data and the null hypothesis.
- The core question is: Why does my sample data disagree with the null hypothesis?
- Two possible reasons: the null hypothesis is wrong, or sampling variability caused the disagreement.
Definition and Meaning of P-Value
- The p-value is the probability of obtaining the observed sample statistic, or a more extreme result, through sampling variability alone, assuming the null hypothesis is true.
- It is a conditional probability calculated under the assumption that the null hypothesis is true.
- "More extreme" means a result farther from the value claimed by the null hypothesis than the observed sample statistic.
- P-values are calculated from a sample statistic, such as the sample mean or sample proportion, or from a test statistic.
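The definition above can be made concrete with a coin-flip sketch. Everything specific here is an assumption chosen for illustration: the null claim H0: p = 0.5, a sample of 20 flips, an observed count of 15 heads, a one-sided "more extreme" direction (15 or more heads), and the hypothetical function names. The first function computes the probability exactly from the binomial distribution; the second estimates the same probability by repeatedly sampling from a world where the null hypothesis is true.

```python
import math
import random

def exact_upper_tail_p_value(observed, n, p_null):
    """P(X >= observed) for X ~ Binomial(n, p_null), i.e. assuming H0 is true."""
    return sum(
        math.comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(observed, n + 1)
    )

def simulated_upper_tail_p_value(observed, n, p_null, trials, rng):
    """Estimate the same probability by drawing many samples under H0."""
    as_extreme = 0
    for _ in range(trials):
        successes = sum(1 for _ in range(n) if rng.random() < p_null)
        if successes >= observed:  # "observed or more extreme"
            as_extreme += 1
    return as_extreme / trials

rng = random.Random(0)          # fixed seed so the run is repeatable
n, observed, p_null = 20, 15, 0.5  # assumed example values

print("exact:    ", exact_upper_tail_p_value(observed, n, p_null))
print("simulated:", simulated_upper_tail_p_value(observed, n, p_null, 20000, rng))
```

The exact value is 21700/1048576, roughly 0.021, and the simulated estimate should land close to it. Both numbers answer the same conditional question: if the null hypothesis were true, how often would sampling variability produce 15 or more heads in 20 flips?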
Interpreting P-Values
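- A low p-value (close to zero) means a result this extreme would be unlikely to arise from sampling variability alone if the null hypothesis were true, so we reject the null hypothesis.
- Rejecting the null hypothesis means the sample gives strong evidence that the null is wrong, although this is never certain. The p-value itself is not the probability that the null hypothesis is true.
- A higher p-value (not close to zero, e.g., 20%) means the disagreement could plausibly be due to sampling variability, so we fail to reject the null hypothesis. This does not show that the null hypothesis is true.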
- P-value interpretation is always about the null hypothesis, not the alternative.
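The interpretation rules above can be sketched by comparing two hypothetical coin-flip results against the same assumed null claim, H0: p = 0.5, with 20 flips each. The lecture only says "close to zero" and gives no numeric cutoff, so the 0.05 threshold below is an assumed convention, and the function names are hypothetical. Note that both outcomes are phrased as statements about the null hypothesis.

```python
import math

def upper_tail_p_value(observed, n, p_null):
    """P(X >= observed) for X ~ Binomial(n, p_null), i.e. assuming H0 is true."""
    return sum(
        math.comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(observed, n + 1)
    )

def interpret(p_value, cutoff=0.05):
    """Describe the decision about H0. The cutoff is an assumed convention."""
    if p_value < cutoff:
        return "reject H0: sampling variability alone rarely gives a result this extreme"
    return "fail to reject H0: sampling variability could plausibly explain the result"

n, p_null = 20, 0.5  # assumed example values
for observed in (15, 12):
    p_value = upper_tail_p_value(observed, n, p_null)
    print(f"{observed} heads of {n}: p-value = {p_value:.4f} -> {interpret(p_value)}")
```

With these assumed numbers, 15 heads gives a p-value of about 0.021 (low, so the null is rejected), while 12 heads gives about 0.25 (not close to zero, so the null is not rejected).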
Key Terms & Definitions
- Null Hypothesis (H₀) — a statement about the population with an equality claim.
- Alternative Hypothesis (H₁) — a statement about the population that disagrees with the null.
- Test Statistic — a value that measures the deviation of the sample from the null hypothesis.
- Sampling Variability — the natural difference between random samples and the population.
- P-value — the probability of getting the observed sample statistic or more extreme results if the null hypothesis is true.
Action Items / Next Steps
- Review the definition and logic of p-value.
- Prepare for the next lesson on how to calculate p-values.