L29

Sep 20, 2024

Lecture Notes: Central Limit Theorem and Statistical Inference

Overview

  • The Central Limit Theorem (CLT) bridges probability theory and statistical inference.
  • CLT states that for large sample sizes, the sampling distribution of the sample mean (or sum) of independent, identically distributed random variables with finite variance is approximately normal, even if the original population is non-normal.
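A small simulation can make the CLT statement concrete. The sketch below is illustrative only: the exponential population (mean 1, standard deviation 1), the sample size of 50, and the helper name `sample_means` are assumptions, not part of the lecture.

```python
import random
import statistics

def sample_means(draw, n, trials, seed=0):
    """Return `trials` sample means, each computed from `n` draws of `draw(rng)`."""
    rng = random.Random(seed)
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(trials)]

def draw_exponential(rng):
    # Exponential population with rate 1: mean 1, standard deviation 1, strongly skewed.
    return rng.expovariate(1.0)

n = 50
means = sample_means(draw_exponential, n=n, trials=5000)

center = statistics.fmean(means)   # should be close to the population mean, 1
spread = statistics.stdev(means)   # should be close to sigma / sqrt(n)
se = 1.0 / n ** 0.5                # theoretical standard error of the mean
# Fraction of sample means within 1.96 standard errors of the population mean.
within = sum(abs(m - 1.0) <= 1.96 * se for m in means) / len(means)

print(f"mean of sample means: {center:.3f}")
print(f"std of sample means:  {spread:.3f} (theory: {se:.3f})")
print(f"within 1.96 SE:       {within:.3f} (normal theory: 0.950)")
```

Even though the population is skewed, the sample means cluster around 1 with spread near 1/√50, and roughly 95% of them fall within 1.96 standard errors, as a normal distribution would predict.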

Key Formulas and Definitions

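  • Sample Mean (x̄): ( x̄ = \frac{1}{n} \sum_{i=1}^{n} x_i )
  • Standard Normal Variable (z): for the sum of n observations, ( z = \frac{\sum x_i - n\mu}{\sigma\sqrt{n}} ); equivalently, for the sample mean, ( z = \frac{x̄ - \mu}{\sigma/\sqrt{n}} ), where μ and σ are the population mean and standard deviation.
  • Sample Variance (S²): ( S² = \frac{\sum_{i=1}^{n} (x_i - x̄)²}{n-1} )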
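The formulas above translate directly into code. This is a minimal sketch; the data values and the population parameters μ = 4.5 and σ = 2 are made up for illustration, and the function names are hypothetical.

```python
import math

def sample_mean(xs):
    """x-bar = (1/n) * sum of x_i."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """S^2 = sum of (x_i - x-bar)^2, divided by n - 1."""
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

def z_from_sum(xs, mu, sigma):
    """Standardize the sample sum: (sum x_i - n*mu) / (sigma * sqrt(n))."""
    n = len(xs)
    return (sum(xs) - n * mu) / (sigma * math.sqrt(n))

def z_from_mean(xs, mu, sigma):
    """Standardize the sample mean: (x-bar - mu) / (sigma / sqrt(n))."""
    n = len(xs)
    return (sample_mean(xs) - mu) / (sigma / math.sqrt(n))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data, not from the lecture
mu, sigma = 4.5, 2.0              # assumed population parameters

print(sample_mean(data))              # 5.0
print(sample_variance(data))          # 32/7, about 4.571
print(z_from_sum(data, mu, sigma))    # about 0.707
print(z_from_mean(data, mu, sigma))   # same value as the sum form
```

The two z functions give the same number because dividing numerator and denominator of the sum form by n yields the mean form.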

Relationship Between Sample and Population Variance

  • Taking expectations gives ( E(S²) = \sigma² ), so the sample variance (with divisor n − 1) is an unbiased estimator of the population variance.
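The unbiasedness claim can be checked numerically by averaging S² over many samples. The sketch below assumes a uniform(0, 1) population (variance 1/12) and a sample size of 5; both choices are illustrative. It also shows what happens if n is used as the divisor instead of n − 1.

```python
import random
import statistics

def average_variance_estimates(n, trials, seed=0):
    """Average the (n-1)-divisor and n-divisor variance estimates over many samples."""
    rng = random.Random(seed)
    total_unbiased = 0.0
    total_biased = 0.0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]        # uniform(0, 1) sample
        total_unbiased += statistics.variance(xs)    # divides by n - 1
        total_biased += statistics.pvariance(xs)     # divides by n
    return total_unbiased / trials, total_biased / trials

true_variance = 1 / 12   # variance of the uniform(0, 1) distribution
avg_unbiased, avg_biased = average_variance_estimates(n=5, trials=20000)

print(f"true variance:          {true_variance:.4f}")
print(f"average with n-1:       {avg_unbiased:.4f}")   # close to the true variance
print(f"average with n instead: {avg_biased:.4f}")     # systematically too small
```

With the n divisor the average settles near (n − 1)/n times the true variance, which is the bias that the n − 1 divisor removes.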

Inferential Statistics

  • Uses sample data to estimate population parameters.
  • Point Estimate: A single value used as the estimate (e.g., GDP growth rate).
  • Interval Estimate: A range of plausible values (e.g., house price).

Characteristics of Estimators

  • Unbiased Estimator: Its expected value equals the parameter being estimated, so it does not systematically overestimate or underestimate.
  • Low Variance: Desirable for precision.
  • Sample mean is a good estimator: it is unbiased, ( E(x̄) = \mu ), its variance ( \frac{\sigma²}{n} ) shrinks as n grows, and for large n it is approximately normally distributed.
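The variance ( \frac{\sigma²}{n} ) quoted above follows in two lines, assuming the observations are independent with common mean μ and variance σ² (here \bar{x} denotes x̄):

```latex
\begin{align*}
E(\bar{x}) &= \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \frac{1}{n}\cdot n\mu = \mu, \\
\operatorname{Var}(\bar{x}) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(x_i)
  = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}.
\end{align*}
```

The first line shows the sample mean is unbiased; the second shows its variance falls in proportion to 1/n, which is why larger samples give more precise estimates.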

Confidence Intervals

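  • Interval constructed from the sample so that, over repeated sampling, it contains the parameter with a stated probability (e.g., 95% for a 95% confidence interval).
  • Formula (95% interval for the mean, σ known, large n): ( x̄ \pm 1.96 \frac{\sigma}{\sqrt{n}} )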
  • Example: Polar bear weight estimation (confidence interval of 970 to 1030 pounds based on a sample mean of 1000 pounds).
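The interval formula can be sketched in a few lines. The notes give only the sample mean (1000 pounds) and the resulting interval (970 to 1030), so the values σ = 153 and n = 100 below are hypothetical, chosen so that a 95% margin of error comes out near 30 pounds; the function name is also an assumption.

```python
import math

def mean_confidence_interval(sample_mean, sigma, n, z=1.96):
    """Return (low, high) for x-bar +/- z * sigma / sqrt(n), with sigma assumed known."""
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# sigma and n are hypothetical: the lecture does not state them.
low, high = mean_confidence_interval(1000, sigma=153, n=100)
print(f"{low:.1f} to {high:.1f} pounds")   # 970.0 to 1030.0 pounds
```

Any other pair with σ/√n close to 15.3 would reproduce the same interval, for example a larger sample from a more variable population.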

Understanding 95% Confidence Interval

  • On average, about 1 in 20 samples (5%) will produce a confidence interval that does not contain the population mean.
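The "1 in 20" interpretation can be checked by drawing many samples and counting how often the interval misses the true mean. This is a rough sketch under assumed values: a normal population with μ = 1000 and σ = 153, samples of size 25, and 4000 repetitions, none of which come from the lecture.

```python
import math
import random
import statistics

def coverage_miss_rate(mu, sigma, n, trials, z=1.96, seed=0):
    """Fraction of samples whose interval x-bar +/- z*sigma/sqrt(n) excludes mu."""
    rng = random.Random(seed)
    margin = z * sigma / math.sqrt(n)
    misses = 0
    for _ in range(trials):
        xbar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        # The interval misses mu exactly when x-bar is more than one margin away.
        if abs(xbar - mu) > margin:
            misses += 1
    return misses / trials

# All parameter values here are illustrative assumptions.
miss_rate = coverage_miss_rate(mu=1000, sigma=153, n=25, trials=4000)
print(f"intervals missing the true mean: {miss_rate:.3f}")   # near 0.05
```

The miss rate fluctuates around 0.05 from run to run; it is the long-run frequency, not any single interval, that the 95% figure describes.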

Proportion Example

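  • In opinion polls, the sample proportion is approximately normally distributed for large samples.
  • Margin of Error: ( 1.96 \times \sqrt{\frac{pq}{n}} ), where p is the sample proportion, q = 1 − p, and n is the sample size.
    • For example, if 70% of respondents believe in global warming and the margin of error is 0.09, the true proportion is estimated to lie between 0.61 and 0.79.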
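The margin-of-error formula can be evaluated directly. The notes do not state the poll's sample size; n = 100 below is an assumption, chosen because it reproduces the quoted range of 0.61 to 0.79. The function name is hypothetical.

```python
import math

def proportion_margin_of_error(p, n, z=1.96):
    """Margin of error z * sqrt(p * q / n), with q = 1 - p."""
    q = 1 - p
    return z * math.sqrt(p * q / n)

p = 0.70   # sample proportion from the example
n = 100    # assumed sample size, not given in the lecture

margin = proportion_margin_of_error(p, n)
low, high = p - margin, p + margin
print(f"margin of error: {margin:.3f}")      # 0.090
print(f"interval: {low:.2f} to {high:.2f}")  # 0.61 to 0.79
```

Because n sits under a square root, halving the margin of error requires roughly four times as many respondents.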

Conclusion

  • CLT is crucial for linking probability with statistical inference.
  • Confidence intervals provide a range for estimating a population mean or proportion.

Thank you for your attention. Looking forward to the next class.