Understanding P-Values in Hypothesis Testing

Overview

This lecture introduces the p-value in hypothesis testing: its definition, its interpretation, and its relationship to sampling variability and the null hypothesis.

Review of Hypothesis Testing

Hypothesis tests involve a null hypothesis (statement with equality) and an alternative hypothesis (statement without equality).
The test statistic measures how far the sample data deviate from what the null hypothesis claims.
Sampling variability means that a random sample usually differs somewhat from the population, and so from the value stated in the null hypothesis, even when the null hypothesis is true.
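
Example (illustrative sketch, not from the lecture): the short Python simulation below draws a few random samples from a population in which the null hypothesis is actually true, then prints each sample proportion together with a z-type test statistic. The null value 0.5, the sample size 100, the seed, and the function names are all assumptions chosen for illustration.

```python
import math
import random


def sample_proportion(p_true, n, rng):
    """Return the proportion of successes in one random sample of size n."""
    successes = sum(1 for _ in range(n) if rng.random() < p_true)
    return successes / n


def z_statistic(p_hat, p_null, n):
    """Standardized distance between the sample proportion and the null value."""
    standard_error = math.sqrt(p_null * (1 - p_null) / n)
    return (p_hat - p_null) / standard_error


rng = random.Random(0)  # fixed seed so repeated runs give the same samples
P_NULL = 0.5            # assumed null value: H0 says p = 0.5
N = 100                 # assumed sample size

# The population really has p = 0.5, so H0 is true in this simulation.
proportions = [sample_proportion(P_NULL, N, rng) for _ in range(5)]
for p_hat in proportions:
    z = z_statistic(p_hat, P_NULL, N)
    print(f"sample proportion = {p_hat:.2f}, test statistic z = {z:+.2f}")
```

Even though the null hypothesis holds in this simulation, the sample proportions will typically not equal 0.5 exactly. That spread from sample to sample is sampling variability.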

Introduction to P-Value

The p-value is a tool for judging a disagreement between the sample data and the null hypothesis.
The core question is: Why does my sample data disagree with the null hypothesis?
There are two possible reasons: the null hypothesis is wrong, or sampling variability caused the disagreement.

Definition and Meaning of P-Value

The p-value is the probability of obtaining the observed sample statistic, or a more extreme one, through sampling variability alone, assuming the null hypothesis is true.
It is a conditional probability calculated under the assumption that the null hypothesis is true.
"More extreme" means a result that lies farther from what the null hypothesis claims than the observed sample statistic does.
P-values are calculated from a sample statistic, such as a sample mean, a sample proportion, or a test statistic.
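
Example (illustrative sketch, not from the lecture): one way to see the definition in action is to estimate a p-value by simulation. The code below assumes a made-up scenario: 60 successes observed in 100 trials, a null hypothesis of p = 0.5, and "more extreme" taken to mean "60 or more successes" (a one-sided reading). It generates many samples from a population where the null is true and counts how often the result is at least as extreme as the observed one. An exact binomial calculation is included for comparison. All names and numbers are assumptions.

```python
import random
from math import comb


def simulated_p_value(observed_count, n, p_null, n_sims, rng):
    """Estimate P(count >= observed_count) when the null hypothesis is true."""
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # Generate one sample of size n from a population where H0 holds.
        count = sum(1 for _ in range(n) if rng.random() < p_null)
        if count >= observed_count:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_sims


def exact_p_value(observed_count, n, p_null):
    """Exact binomial tail probability P(count >= observed_count) under H0."""
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(observed_count, n + 1)
    )


rng = random.Random(42)  # fixed seed so the estimate is repeatable
p_sim = simulated_p_value(60, 100, 0.5, 10_000, rng)
p_exact = exact_p_value(60, 100, 0.5)
print(f"simulated p-value: {p_sim:.4f}")
print(f"exact p-value:     {p_exact:.4f}")
```

The simulated value is only an estimate and moves closer to the exact tail probability as the number of simulated samples grows. Both numbers answer the same conditional question: how often would sampling variability alone produce a result this extreme if the null hypothesis were true?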

Interpreting P-Values

A low p-value (close to zero) means that a result this extreme would be unlikely to arise from sampling variability alone if the null hypothesis were true, so we reject the null hypothesis.
Rejecting the null hypothesis means the data give evidence that the null is wrong, but not proof: a rare sample can still occur when the null is true.
A higher p-value (not close to zero, e.g., 20%) means the disagreement could plausibly be due to sampling variability, so we do not reject the null hypothesis.
P-value interpretation is always about the null hypothesis, not the alternative.
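
Example (illustrative sketch, not from the lecture): to contrast a low p-value with a higher one, the code below computes exact one-sided binomial p-values for two made-up outcomes, 60 and 54 successes in 100 trials, under an assumed null hypothesis of p = 0.5. The function name, the outcomes, and the one-sided definition of "more extreme" are assumptions for illustration.

```python
from math import comb


def upper_tail_p_value(observed_count, n, p_null):
    """P(count >= observed_count) for n trials when H0: p = p_null is true."""
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(observed_count, n + 1)
    )


N = 100       # assumed sample size
P_NULL = 0.5  # assumed null value

p_far = upper_tail_p_value(60, N, P_NULL)   # sample far from the null value
p_near = upper_tail_p_value(54, N, P_NULL)  # sample close to the null value

print(f"60 successes out of {N}: p-value = {p_far:.3f}")
print(f"54 successes out of {N}: p-value = {p_near:.3f}")
```

The first p-value is close to zero, the kind of result that leads to rejecting the null hypothesis. The second is not close to zero, so that disagreement could plausibly come from sampling variability. In both cases the statement is about the null hypothesis, not the alternative.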

Key Terms & Definitions

Null Hypothesis (H₀) — a statement about the population with an equality claim.
Alternative Hypothesis (H₁) — a statement about the population that disagrees with the null.
Test Statistic — a value that measures the deviation of the sample from the null hypothesis.
Sampling Variability — the natural difference between random samples and the population.
P-value — the probability of getting the observed sample statistic or more extreme results if the null hypothesis is true.

Action Items / Next Steps

Review the definition and logic of the p-value.
Prepare for the next lesson on how to calculate p-values.