Understanding t-Distribution and Inference Techniques
May 7, 2025
Unit 7: Inference for Quantitative Data - Means
The t-Distribution
Used when the population standard deviation is unknown and must be estimated with the sample standard deviation (s).
Introduced in 1908 by W. S. Gosset.
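When the population is normally distributed, the statistic (x̄ - μ)/(s/√n) follows a t-distribution, which is bell-shaped and symmetric but more spread out (heavier-tailed) than the normal distribution.
Depends on the degrees of freedom (df), calculated as sample size minus one (df = n - 1); as df increases, the t-distribution approaches the normal distribution.
The proper choice whenever the population standard deviation is unknown, which is the case in most real-world problems.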
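The "more spread out" claim can be checked numerically. The sketch below is illustrative only: it writes out the standard t density and approximates the tail area P(T > 2) with a simple midpoint rule. The helper names (t_pdf, normal_pdf, upper_tail) and the cutoff of 60 used to truncate the integral are assumptions made for this example, not part of the unit.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def upper_tail(pdf, start, stop=60.0, steps=60_000):
    """Approximate P(X > start) with the midpoint rule on [start, stop]."""
    width = (stop - start) / steps
    return sum(pdf(start + (i + 0.5) * width) for i in range(steps)) * width

# Tail area beyond 2 shrinks toward the normal value as df grows.
for df in (4, 29, 200):
    tail = upper_tail(lambda x: t_pdf(x, df), 2.0)
    print(f"df={df:>3}: P(T > 2) is about {tail:.4f}")
print(f"normal: P(Z > 2) is about {upper_tail(normal_pdf, 2.0):.4f}")
```

With few degrees of freedom the tail area is noticeably larger than the normal one, which is why t critical values exceed z critical values for small samples.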
Confidence Interval for a Mean
If the sample size (n) is large, then by the central limit theorem:
Sample means are approximately normally distributed.
Mean of sample means equals the population mean (μ).
Standard deviation of sample means equals the population standard deviation (σ) divided by the square root of the sample size: σ/√n.
When σ is unknown, substitute the sample standard deviation (s); the resulting estimate, s/√n, is called the standard error.
Example 7.1: Gas mileage study with confidence intervals and significance testing.
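A minimal sketch of the interval x̄ ± t*·s/√n described in this section. The mileage readings are hypothetical (they are not the data of Example 7.1), the function name mean_confidence_interval is made up for illustration, and the critical value t* is assumed to be looked up in a t table rather than computed.

```python
import math
import statistics

def mean_confidence_interval(sample, t_star):
    """Return (lower, upper) for x_bar +/- t_star * s / sqrt(n)."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)        # sample standard deviation (divides by n - 1)
    standard_error = s / math.sqrt(n)   # estimated sd of the sample mean
    margin = t_star * standard_error
    return x_bar - margin, x_bar + margin

# Hypothetical gas-mileage readings in mpg (not the data of Example 7.1).
mileage = [27.1, 28.4, 26.9, 29.0, 27.7, 28.2, 26.5, 28.8, 27.4, 28.0]

# t* = 2.262 is the tabled 95% critical value for df = 10 - 1 = 9.
low, high = mean_confidence_interval(mileage, t_star=2.262)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

Passing t* in as an argument keeps the sketch free of external libraries; a different confidence level or sample size needs a different tabled value.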
Significance Test for a Mean
Requires:
Simple random sample.
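Sample size less than 10% of the population (when sampling without replacement).
Sample size large enough for the central limit theorem (CLT) to apply, or an approximately normal population distribution.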
Example 7.2: Testing manufacturer's claim about electricity usage with significance levels.
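Once the conditions above are met, the test statistic is t = (x̄ - μ0)/(s/√n) with df = n - 1. The sketch below computes it together with a two-sided P-value obtained by numerically integrating the t density. The usage figures and the claimed mean of 50 are hypothetical (not the data of Example 7.2), and the helper names are assumptions for this illustration.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=20_000):
    """P(T > t) for t >= 0: 0.5 minus the midpoint-rule area over [0, t]."""
    width = t / steps
    area = sum(t_pdf((i + 0.5) * width, df) for i in range(steps)) * width
    return 0.5 - area

def one_sample_t_test(sample, mu_0):
    """Return (t statistic, two-sided P-value) for H0: mu = mu_0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - mu_0) / standard_error
    p_two_sided = 2 * t_upper_tail(abs(t_stat), n - 1)
    return t_stat, p_two_sided

# Hypothetical electricity usage in kWh; the claimed mean of 50 is also made up.
usage = [52.1, 49.8, 53.4, 51.2, 50.9, 54.0, 52.7, 51.5]
t_stat, p_value = one_sample_t_test(usage, mu_0=50)
print(f"t = {t_stat:.3f}, two-sided P-value is about {p_value:.4f}")
```

For a one-sided alternative, the P-value would be the single tail area in the direction of the alternative rather than twice the tail.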
Confidence Interval for the Difference of Two Means
Sampling distribution of the difference of sample means is approximately normal when both samples are large or both populations are normal.
Use a t-distribution if the population standard deviations are unknown.
Example 7.3: Comparing accident rates between two departments with confidence intervals.
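A sketch of the two-sample interval (x̄1 - x̄2) ± t*·√(s1²/n1 + s2²/n2). The accident counts are hypothetical (not the data of Example 7.3), and the function name two_sample_summary is made up. Two assumptions are worth flagging: the code reports the Welch-Satterthwaite degrees of freedom, which the notes do not mention, and it takes t* from a table using the conservative choice df = min(n1, n2) - 1.

```python
import math
import statistics

def two_sample_summary(a, b):
    """Return (difference of sample means, standard error, Welch df)."""
    var_a = statistics.variance(a) / len(a)   # s1^2 / n1
    var_b = statistics.variance(b) / len(b)   # s2^2 / n2
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(var_a + var_b)
    welch_df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return diff, se, welch_df

# Hypothetical monthly accident counts (not the data of Example 7.3).
dept_a = [4, 6, 5, 7, 3, 5, 6, 4, 5, 5]
dept_b = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]

diff, se, welch_df = two_sample_summary(dept_a, dept_b)

# Conservative choice: df = min(n1, n2) - 1 = 9, whose tabled 95% t* is 2.262.
t_star = 2.262
low, high = diff - t_star * se, diff + t_star * se
print(f"difference = {diff:.2f}, SE = {se:.3f}, Welch df = {welch_df:.1f}")
print(f"95% CI for the difference: ({low:.2f}, {high:.2f})")
```

The conservative df gives a slightly wider interval than the Welch df would; software typically uses the latter.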
Significance Test for the Difference of Two Means
Requires two independent simple random samples and either normally distributed populations or large enough sample sizes.
The null hypothesis usually states that the two population means are equal (μ1 - μ2 = 0).
Example 7.4: Comparing computer downtime between two companies with type I and type II errors.
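The test described in this section can be written out as follows. This is a sketch assuming the unpooled (unequal-variance) form of the statistic, which the notes do not specify; the subscripts 1 and 2 label the two independent samples.

```latex
H_0:\ \mu_1 - \mu_2 = 0
\qquad
t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

In this setting a Type I error means concluding the means differ when they are in fact equal, and a Type II error means failing to detect a real difference.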
Paired Data
Compute the difference within each pair, then run a one-sample analysis on those differences (not a two-sample analysis).
Example 7.5: SAT score improvements with confidence intervals.
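A short sketch of the paired procedure: reduce each pair to a single difference, then build a one-sample t interval on the differences. The before/after scores are hypothetical (not the data of Example 7.5), and t* is assumed to come from a t table.

```python
import math
import statistics

# Hypothetical SAT scores for the same 8 students before and after a course.
before = [1080, 1150, 990, 1210, 1040, 1120, 1300, 1010]
after = [1110, 1170, 1030, 1200, 1090, 1150, 1320, 1060]

# One difference per student; the pairing is what justifies a one-sample analysis.
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
d_bar = statistics.mean(differences)
s_d = statistics.stdev(differences)

# t* = 2.365 is the tabled 95% critical value for df = 8 - 1 = 7.
t_star = 2.365
margin = t_star * s_d / math.sqrt(n)
low, high = d_bar - margin, d_bar + margin
print(f"mean improvement = {d_bar:.2f}, 95% CI: ({low:.2f}, {high:.2f})")
```

Treating the two columns as independent samples would ignore the student-to-student variation that pairing removes, usually giving a wider interval.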
Simulations and P-Values
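Simulations estimate P-values by modeling the distribution of the test statistic under the null hypothesis; the estimated P-value is the proportion of simulated statistics at least as extreme as the observed one.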
Example 7.6: Machinery recalibration based on median absolute deviation calculations.
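A minimal sketch of a simulation-based P-value. Everything specific here is an assumption for illustration: the test statistic is the sample mean (Example 7.6 uses a different statistic), the null model is a normal population with mean 100 and standard deviation 5, the sample size is 25, and the observed value is 102. The function names are made up.

```python
import random
import statistics

def simulated_p_value(observed_stat, simulate_stat, trials=10_000, seed=0):
    """Proportion of simulated null statistics at least as large as the observed one."""
    rng = random.Random(seed)   # fixed seed so the run is reproducible
    hits = sum(simulate_stat(rng) >= observed_stat for _ in range(trials))
    return hits / trials

def null_sample_mean(rng, mu_0=100.0, sigma=5.0, n=25):
    """Draw one sample of size n under the null model and return its mean."""
    return statistics.fmean(rng.gauss(mu_0, sigma) for _ in range(n))

# Hypothetical observed sample mean, tested against H0: mu = 100 (upper-tailed).
p_hat = simulated_p_value(102.0, null_sample_mean)
print(f"estimated P-value: {p_hat:.4f}")
```

Any statistic can be swapped in by replacing null_sample_mean with a function that simulates that statistic under the null hypothesis; more trials give a more stable estimate.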
More on Power, Type I Errors, and Type II Errors
Type I error: Rejecting a true null hypothesis.
Type II error: Failing to reject a false null hypothesis.
Power: Probability of correctly rejecting a false null hypothesis; equals 1 minus the probability of a Type II error.
Example 7.7: Testing political support claims with power analysis.
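Power can be computed directly in a simplified setting. The sketch below assumes an upper-tailed z test of a mean with known population standard deviation, which is a simplification chosen for illustration and not the setup of Example 7.7; the function name and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

def power_upper_tail_z(mu_0, mu_alt, sigma, n, alpha=0.05):
    """Power of the test of H0: mu = mu_0 vs Ha: mu > mu_0 when the true mean is mu_alt."""
    se = sigma / math.sqrt(n)
    # Reject H0 when the sample mean exceeds this cutoff.
    cutoff = mu_0 + NormalDist().inv_cdf(1 - alpha) * se
    # Probability the sample mean lands beyond the cutoff under the alternative.
    return 1 - NormalDist(mu_alt, se).cdf(cutoff)

for n in (25, 50, 100):
    power = power_upper_tail_z(mu_0=100, mu_alt=102, sigma=5, n=n)
    print(f"n={n:>3}: power = {power:.3f}, P(Type II error) = {1 - power:.3f}")
```

Power rises with larger samples, a larger significance level, or a true mean farther from the null value; when the true mean equals the null value, the rejection probability is just the Type I error rate alpha.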
Confidence Intervals Versus Hypothesis Tests
Hypothesis tests assess claims about parameters; confidence intervals estimate parameters.
Example 7.8: Comparing basketball player shooting accuracy with hypothesis tests and confidence intervals. The hypothesis test and the confidence interval lead to consistent decisions across sample sizes.
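The agreement between the two methods can be illustrated for a mean: a two-sided t test at level α rejects H0: μ = μ0 exactly when μ0 falls outside the (1 - α) confidence interval. The sketch below uses hypothetical points-per-game figures (not the data of Example 7.8), a made-up function name, and a tabled t*.

```python
import math
import statistics

def interval_and_decision(sample, mu_0, t_star):
    """Return the t interval and whether the two-sided test rejects H0: mu = mu_0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    interval = (x_bar - t_star * se, x_bar + t_star * se)
    reject = abs((x_bar - mu_0) / se) > t_star
    return interval, reject

# Hypothetical points per game over 10 games.
points = [21, 18, 25, 30, 22, 19, 27, 24, 20, 26]

# t* = 2.262 is the tabled 95% critical value for df = 9 (alpha = 0.05, two-sided).
for mu_0 in (15, 20, 23, 26, 30):
    (low, high), reject = interval_and_decision(points, mu_0, t_star=2.262)
    inside = low <= mu_0 <= high
    print(f"mu_0={mu_0}: in CI? {inside}, reject H0? {reject}")
```

The correspondence holds for two-sided tests paired with intervals at the matching confidence level; a one-sided test lines up with a one-sided bound instead.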