Understanding Central Limit Theorem and Normal Distribution

Oct 12, 2024

Central Limit Theorem and Normal Distribution Lecture

Introduction to the Galton Board

  • Galton Board: Demonstrates how random, chaotic events can lead to predictable distributions in large numbers of trials.
  • Normal Distribution:
    • Also known as the Bell Curve or Gaussian Distribution.
    • Common in various contexts: e.g., human heights, the number of prime factors of large numbers.

Central Limit Theorem (CLT)

  • A key concept explaining the prevalence of normal distribution.
  • Provides a fundamental understanding of normal distributions with minimal background assumptions.
  • The theorem states that the sum of a large number of independent, identically distributed random variables with finite variance has a distribution that approaches a normal distribution, regardless of the distribution of the individual variables.

Galton Board Model

  • Simplified model: Each ball has a 50-50 chance of bouncing left or right at every peg it hits.
  • Demonstrates the idea of randomness leading to a normal distribution.
  • Emphasizes illustration over realistic physics.
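
The simplified model above can be sketched in a few lines of Python. This is only an illustration under the stated 50-50 assumption; the function name galton_board, the ball and row counts, and the fixed seed are choices made for this example, not part of the lecture.

```python
import random
from collections import Counter

def galton_board(num_balls, num_rows, seed=0):
    """Simulate the simplified board: at every peg a ball goes left or right with probability 1/2."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    bins = Counter()
    for _ in range(num_balls):
        # The final bin index is the number of rightward bounces.
        rights = sum(1 for _ in range(num_rows) if rng.random() < 0.5)
        bins[rights] += 1
    return bins

bins = galton_board(num_balls=10_000, num_rows=12)
for k in range(13):
    # One '#' per 50 balls gives a rough text histogram.
    print(f"{k:2d} {'#' * (bins[k] // 50)}")
```

The printed histogram should pile up around the middle bins and thin out toward the edges, which is the bell shape the board is meant to illustrate.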

Understanding Random Variables

  • Random Variable: A random process whose outcomes are each assigned a numerical value.
  • Example: A die roll, with six outcomes valued 1 through 6.
  • Central Limit Theorem: As the number of variables being added increases, the distribution of the sum increasingly resembles a bell curve.

Simulation and Illustration

  • Changing the distribution of a single die (e.g., weighting its faces) still results in a bell curve once the sum of many rolls is considered.
  • Adding more dice results in a distribution that more closely resembles a bell curve.
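
One way to see this without relying on random sampling is to compute the exact distribution of the sum by repeated convolution. The sketch below assumes a made-up lopsided die (the weights are hypothetical) and uses skewness as a rough measure of how far the distribution is from a symmetric bell shape.

```python
def convolve(p, q):
    """Exact distribution of X + Y for independent X ~ p and Y ~ q ({value: probability} dicts)."""
    out = {}
    for a, pa in p.items():
        for b, pb in q.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out

def sum_distribution(die, n):
    """Exact distribution of the sum of n independent rolls of `die`."""
    dist = {0: 1.0}
    for _ in range(n):
        dist = convolve(dist, die)
    return dist

def skewness(dist):
    """Third standardized moment; 0 for a perfectly symmetric distribution."""
    mean = sum(v * p for v, p in dist.items())
    var = sum((v - mean) ** 2 * p for v, p in dist.items())
    third = sum((v - mean) ** 3 * p for v, p in dist.items())
    return third / var ** 1.5

# Hypothetical lopsided die: low faces are much more likely than high ones.
weighted_die = {1: 0.5, 2: 0.2, 3: 0.1, 4: 0.1, 5: 0.05, 6: 0.05}

for n in (1, 5, 25):
    dist = sum_distribution(weighted_die, n)
    print(f"n={n:2d}  skewness={skewness(dist):.3f}")
```

The skewness shrinks as more rolls are added, so the lopsidedness of the single die fades from the sum.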

Mean and Standard Deviation

  • Mean (µ): Center of mass of a distribution.
  • Standard Deviation (σ): Measures spread of a distribution.
  • Variances add for independent variables: the variance of a sum is the sum of the variances.
  • The standard deviation of a sum is therefore the square root of the sum of the variances, not the sum of the standard deviations.
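
The additivity of variance can be checked exactly on a small case. The sketch below assumes a fair six-sided die (an assumption made for this example) and uses exact fractions so that no rounding is involved.

```python
import math
from fractions import Fraction
from itertools import product

def mean(dist):
    return sum(value * prob for value, prob in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum((value - mu) ** 2 * prob for value, prob in dist.items())

# Assumption: a fair six-sided die, used here only as a convenient example.
one_die = {face: Fraction(1, 6) for face in range(1, 7)}

# Exact distribution of the sum of two independent rolls.
two_dice = {}
for a, b in product(one_die, repeat=2):
    two_dice[a + b] = two_dice.get(a + b, Fraction(0)) + one_die[a] * one_die[b]

print(variance(one_die))              # variance of a single roll
print(variance(two_dice))             # twice the single-roll variance
print(math.sqrt(variance(two_dice)))  # standard deviation of the sum
```

The variance doubles when a second independent roll is added, while the standard deviation grows only by a factor of √2.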

Formula for Normal Distribution

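  • Built from the function e^(-x^2), which gives the bell curve shape.
  • Shifted by the mean µ and scaled by the standard deviation σ, so the exponential becomes e^(-(1/2)((x - µ)/σ)^2).
  • Constant: 1/(σ√(2π)) ensures total area under the curve is 1.
  • Full formula: f(x) = 1/(σ√(2π)) · e^(-(1/2)((x - µ)/σ)^2).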
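
As a rough numerical check of the role of the constant, the sketch below writes the density as a Python function and integrates it with a simple midpoint rule. The names normal_pdf and area, the integration range of ±10σ, and the step count are choices made for this example.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area(mu, sigma, steps=20_000):
    """Midpoint-rule approximation of the area under the curve over mu ± 10 sigma."""
    lo, hi = mu - 10 * sigma, mu + 10 * sigma
    dx = (hi - lo) / steps
    return sum(normal_pdf(lo + (i + 0.5) * dx, mu, sigma) for i in range(steps)) * dx

print(area(0.0, 1.0))   # very close to 1
print(area(3.5, 1.7))   # still very close to 1 after shifting and scaling
```

Without the 1/(σ√(2π)) factor the area would depend on σ; with it, the area stays at 1 for any choice of µ and σ.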

Quantitative Description of CLT

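  • The mean of a sum of n variables is nµ, n times the mean of a single variable.
  • The standard deviation of the sum is σ√n.
  • Subtracting the mean nµ and dividing by the standard deviation σ√n aligns the distributions for different n and reveals the same universal bell curve.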
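
The rescaling described above can be tried on simulated data. In this sketch, sums of 50 rolls are standardized as (S - nµ)/(σ√n) for both a fair die and a hypothetical weighted die, and the fractions falling within 1 and 2 units of zero are compared with the roughly 68% and 95% expected for a standard normal distribution. The function names, weights, trial count, and seed are assumptions made for the example.

```python
import math
import random

def standardized_sums(values, weights, n, trials, seed=0):
    """Simulate sums of n draws and rescale each as (S - n*mu) / (sigma * sqrt(n))."""
    total = sum(weights)
    mu = sum(v * w for v, w in zip(values, weights)) / total
    var = sum((v - mu) ** 2 * w for v, w in zip(values, weights)) / total
    sigma = math.sqrt(var)
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        (sum(rng.choices(values, weights, k=n)) - n * mu) / (sigma * math.sqrt(n))
        for _ in range(trials)
    ]

def fraction_within(zs, k):
    """Fraction of standardized sums with absolute value at most k."""
    return sum(1 for z in zs if abs(z) <= k) / len(zs)

faces = [1, 2, 3, 4, 5, 6]
fair = standardized_sums(faces, [1] * 6, n=50, trials=20_000)
lopsided = standardized_sums(faces, [10, 4, 2, 2, 1, 1], n=50, trials=20_000)  # hypothetical weights

for name, zs in (("fair", fair), ("lopsided", lopsided)):
    print(name, round(fraction_within(zs, 1), 3), round(fraction_within(zs, 2), 3))
```

Both dice should give similar fractions after rescaling, even though their unscaled sums have different centers and spreads. Small deviations from 68% and 95% are expected because the sums are discrete and n is finite.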

Practical Example

  • Rolling a die 100 times and using the properties of a normal distribution to find a range that contains the sum with 95% probability.
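
A minimal sketch of this calculation follows. It assumes a fair six-sided die and the common rule of thumb that about 95% of a normal distribution lies within two standard deviations of the mean; both are assumptions of this example rather than details given in the notes.

```python
import math

# Assumption: a fair six-sided die, so each face has probability 1/6.
faces = range(1, 7)
mu = sum(faces) / 6                                        # mean of one roll
sigma = math.sqrt(sum((f - mu) ** 2 for f in faces) / 6)   # standard deviation of one roll

n = 100
sum_mean = n * mu               # mean of the sum: n times the single-roll mean
sum_sd = sigma * math.sqrt(n)   # standard deviation of the sum: sigma times sqrt(n)

# Rule of thumb for a normal distribution: about 95% of values lie within 2 standard deviations.
low, high = sum_mean - 2 * sum_sd, sum_mean + 2 * sum_sd
print(f"mean = {sum_mean:.0f}, sd = {sum_sd:.2f}")
print(f"about 95% of sums fall between {low:.0f} and {high:.0f}")
```

Under these assumptions the mean of the sum is 350, its standard deviation is about 17.08, and the range comes out to roughly 316 to 384.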

Assumptions of the CLT

  1. Independence: The variables are independent of one another.
  2. Identically Distributed: Each variable is from the same distribution.
    • Often referred to as iid.
  3. Finite Variance: The variance of the variable must be finite.

Conclusion

  • CLT: Under certain conditions (independent, identically distributed variables with finite variance), the distribution of a sum tends to a normal distribution.
  • Preview of future lessons about why the normal distribution has its shape and connection to π.