Understanding Central Limit Theorem and Normal Distribution
Oct 12, 2024
Central Limit Theorem and Normal Distribution Lecture
Introduction to the Galton Board
Galton Board: Demonstrates how random, chaotic events can lead to predictable distributions in large numbers of trials.
Normal Distribution:
Also known as the Bell Curve or Gaussian Distribution.
Appears in many contexts: e.g., human heights, the number of distinct prime factors of large integers.
Central Limit Theorem (CLT)
A key concept explaining why the normal distribution is so prevalent.
Provides a fundamental understanding of normal distributions with minimal background assumptions.
The theorem states that the sum of a large number of independent, identically distributed random variables with finite variance is approximately normally distributed, regardless of the shape of the original distribution.
Galton Board Model
Simplified model: At each peg, a ball has a 50-50 chance of bouncing left or right.
Demonstrates the idea of randomness leading to a normal distribution.
Emphasizes illustration over realistic physics.
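The simplified model above can be sketched in a few lines of Python. This is only an illustration under the notes' 50-50 assumption, not a physical simulation; the function name galton_board, the ball and row counts, and the fixed seed are all choices made here for the example.

```python
import random
from collections import Counter

def galton_board(num_balls, num_rows, seed=0):
    """Drop balls through rows of pegs; each peg sends a ball left or right with probability 1/2."""
    rng = random.Random(seed)
    bins = Counter()
    for _ in range(num_balls):
        # The final bin index equals the number of rightward bounces.
        position = sum(rng.random() < 0.5 for _ in range(num_rows))
        bins[position] += 1
    return bins

bins = galton_board(num_balls=2000, num_rows=12)

# Crude text histogram: one '#' per 10 balls in each bin.
for k in range(13):
    print(f"{k:2d} {'#' * (bins[k] // 10)}")
```

Most balls should pile up near the middle bins, with the counts tapering off toward both edges.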
Understanding Random Variables
Random Variable: A random process in which each possible outcome is assigned a numerical value.
Example: Die roll with six outcomes.
Central Limit Theorem: As the number of random variables being added increases, the distribution of their sum increasingly resembles a bell curve.
Simulation and Illustration
Changing the distribution of a single die (e.g., weighting it) still produces a bell curve once the sums of many rolls are considered.
Adding more dice yields distributions that resemble a bell curve more closely.
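One way to see this without random sampling is to compute the exact distribution of the sum by repeated convolution. The sketch below assumes independent rolls; the weights in weighted_die are made up for illustration, and the helper names convolve and sum_distribution are hypothetical.

```python
def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q (dicts mapping value -> probability)."""
    out = {}
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] = out.get(x + y, 0.0) + px * qy
    return out

def sum_distribution(die, n):
    """Exact distribution of the sum of n independent rolls of the given die."""
    dist = {0: 1.0}  # the sum of zero rolls is 0 with certainty
    for _ in range(n):
        dist = convolve(dist, die)
    return dist

# Hypothetical weighted die, heavily skewed toward 6.
weighted_die = {1: 0.05, 2: 0.05, 3: 0.10, 4: 0.10, 5: 0.20, 6: 0.50}

dist = sum_distribution(weighted_die, 20)
peak = max(dist.values())

# Text histogram of the sum of 20 rolls, skipping very unlikely totals.
for total in sorted(dist):
    if dist[total] > 0.005:
        print(f"{total:3d} {'#' * round(40 * dist[total] / peak)}")
```

Rerunning with a smaller n (say 1 or 3) shows how lopsided the distribution is before many rolls are added.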
Mean and Standard Deviation
Mean (µ): Center of mass of a distribution.
Standard Deviation (σ): Measures spread of a distribution.
Variances add for independent variables: the variance of a sum is the sum of the variances.
The standard deviation of such a sum is the square root of the sum of the variances.
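As a small check of these facts, the sketch below computes the mean and variance of a fair six-sided die exactly, then does the same for the sum of two independent dice by enumerating all 36 pairs. The fair die is an assumption made for this example.

```python
from fractions import Fraction
from itertools import product
import math

faces = range(1, 7)
p = Fraction(1, 6)  # each face of a fair die is equally likely

mean = sum(p * x for x in faces)
variance = sum(p * (x - mean) ** 2 for x in faces)

# Sum of two independent dice: all 36 ordered pairs are equally likely.
pair_p = Fraction(1, 36)
pairs = list(product(faces, repeat=2))
sum_mean = sum(pair_p * (a + b) for a, b in pairs)
sum_var = sum(pair_p * (a + b - sum_mean) ** 2 for a, b in pairs)

print(mean, variance)      # 7/2 35/12
print(sum_mean, sum_var)   # 7 35/6

# Standard deviations do not add; they combine through the variances.
print(math.sqrt(float(sum_var)), 2 * math.sqrt(float(variance)))
```

The variance of the two-dice sum is exactly twice the single-die variance, while its standard deviation is smaller than twice the single-die standard deviation.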
Formula for Normal Distribution
Built from the function e^(-x^2), which gives the basic bell curve shape.
Rescaled by the standard deviation σ and shifted by the mean µ.
Constant: 1/(σ√(2π)) ensures total area under the curve is 1.
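Written out in standard notation, the construction goes in three steps. The label f for the density function is a naming choice made here.

```latex
% Step 1: the basic bell shape
e^{-x^2}

% Step 2: rescale so the standard deviation is 1, and normalize the area to 1
\frac{1}{\sqrt{2\pi}} \, e^{-x^2/2}

% Step 3: general form with mean \mu and standard deviation \sigma
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}
```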
Quantitative Description of CLT
Mean for the sum is n times the mean of the variable.
Standard deviation for the sum is σ√n.
Aligning the means and rescaling by the standard deviations reveals a universal bell curve.
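In symbols, this can be stated as follows. The names S_n, X_i, a, and b are notation introduced here, with µ and σ denoting the mean and standard deviation of a single variable.

```latex
S_n = X_1 + X_2 + \cdots + X_n, \qquad
\mathbb{E}[S_n] = n\mu, \qquad
\operatorname{SD}(S_n) = \sigma\sqrt{n}

\lim_{n \to \infty} P\!\left( a < \frac{S_n - n\mu}{\sigma\sqrt{n}} < b \right)
  = \int_a^b \frac{1}{\sqrt{2\pi}} \, e^{-x^2/2} \, dx
```

Subtracting nµ aligns the means, and dividing by σ√n rescales the spread, so every sum is compared against the same standard normal curve.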
Practical Example
Rolling a fair die 100 times and using the properties of the normal distribution to find a range that contains the sum with about 95% probability.
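A rough version of this calculation is sketched below. It assumes a fair six-sided die and uses the rule of thumb that about 95% of a normal distribution lies within two standard deviations of the mean. The simulation at the end, with its trial count and seed, is an added sanity check, not part of the lecture.

```python
import math
import random

n = 100
faces = range(1, 7)

# Mean and standard deviation of a single fair die.
mu = sum(faces) / 6
sigma = math.sqrt(sum((x - mu) ** 2 for x in faces) / 6)

# Mean and standard deviation of the sum of n rolls.
sum_mean = n * mu
sum_sd = sigma * math.sqrt(n)

# Rule of thumb: about 95% of values fall within 2 standard deviations of the mean.
low, high = sum_mean - 2 * sum_sd, sum_mean + 2 * sum_sd
print(f"mean = {sum_mean:.1f}, sd = {sum_sd:.2f}")
print(f"approximate 95% range: {low:.1f} to {high:.1f}")

# Sanity check by simulation: fraction of simulated sums landing inside the range.
rng = random.Random(0)
trials = 5000
hits = sum(low <= sum(rng.randint(1, 6) for _ in range(n)) <= high for _ in range(trials))
coverage = hits / trials
print(f"simulated coverage: {coverage:.3f}")
```

The range comes out to roughly 316 to 384, centered on the expected sum of 350.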
Assumptions of the CLT
Independence: Each variable is independent.
Identically Distributed: Each variable is from the same distribution.
Together, these two conditions are often abbreviated as i.i.d. (independent and identically distributed).
Finite Variance: The variance of the variable must be finite.
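To see why independence matters, consider a contrived case invented here for illustration (it is not from the lecture): instead of adding 50 separate rolls, add the same roll to itself 50 times. The sketch below compares the two cases; the trial count and seed are arbitrary.

```python
import random
from collections import Counter

rng = random.Random(0)
n, trials = 50, 5000

# Independent case: each of the n terms is a fresh roll.
independent = Counter(sum(rng.randint(1, 6) for _ in range(n)) for _ in range(trials))

# Fully dependent case: one roll, counted n times.
dependent = Counter(n * rng.randint(1, 6) for _ in range(trials))

print("distinct sums (independent):", len(independent))
print("distinct sums (dependent):  ", sorted(dependent))
```

In the dependent case the sum takes only six values, each about equally often, so no bell curve emerges however large n is.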
Conclusion
CLT: Under these conditions, the distribution of a sum of many random variables tends toward a normal distribution.
Preview of future lessons about why the normal distribution has its shape and connection to π.