Bootstrapping and Confidence Intervals

Jul 31, 2025

Overview

This lecture explains the concept of bootstrapping to create confidence intervals and compares it to traditional sampling distributions and formula-based approaches.

Bootstrapping and Sampling Distributions

  • Bootstrapping simulates the sampling distribution by repeatedly resampling from the original sample with replacement, with each resample the same size as the original sample.
  • A real sampling distribution involves taking many random samples from the population.
  • In bootstrapping, the center of the distribution is the sample statistic, not the population parameter.
  • The shape and standard error of a bootstrap distribution often resemble those of a real sampling distribution.
  • Each bootstrap run gives slightly different results because the randomly drawn resamples differ from run to run.
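
The points above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure StatKey runs internally: the helper name `bootstrap_distribution`, the 5,000 resamples, the seeds, and the simulated temperatures are all assumptions made for the example, and the data are not the lecture's data set. The sketch compares the center of the bootstrap distribution with the sample mean, and the bootstrap standard error with the familiar s / sqrt(n).

```python
import random
import statistics

def bootstrap_distribution(sample, statistic, n_resamples=5000, seed=None):
    """Compute `statistic` on many resamples drawn from `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Each resample is drawn with replacement and has the same size as the
    # original sample, so some observations repeat and others are left out.
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)]

# Hypothetical data: 50 simulated temperatures, NOT the lecture's data set.
data_rng = random.Random(0)
sample = [round(data_rng.gauss(98.25, 0.75), 1) for _ in range(50)]

boot_means = bootstrap_distribution(sample, statistics.mean, seed=1)

sample_mean = statistics.mean(sample)
boot_center = statistics.mean(boot_means)      # should sit near the sample mean
boot_se = statistics.stdev(boot_means)         # bootstrap standard error
formula_se = statistics.stdev(sample) / len(sample) ** 0.5

print(f"sample mean          {sample_mean:.3f}")
print(f"bootstrap center     {boot_center:.3f}")
print(f"bootstrap std error  {boot_se:.3f}")
print(f"s / sqrt(n)          {formula_se:.3f}")
```

Changing `seed=1` to another value shifts the printed bootstrap numbers slightly, which is the run-to-run variation described in the last bullet.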

Creating Confidence Intervals with Bootstrapping

  • Bootstrapping allows for direct calculation of confidence intervals from the simulated distribution.
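  • A 95% confidence interval is the middle 95% of the bootstrap statistics, that is, the range from the 2.5th percentile to the 97.5th percentile of the bootstrap distribution.
  • For the body temperature data, the 95% confidence interval for the population mean is 98.053 to 98.480 degrees Fahrenheit.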
  • Results may vary slightly with each bootstrap due to randomness in resampling.
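
Taking "the middle 95%" amounts to sorting the bootstrap statistics and cutting 2.5% off each end. The sketch below shows one simple way to do that. The function name `percentile_interval` and the simulated data are assumptions for illustration, and tools such as StatKey may use a slightly different percentile rule, so endpoints can differ in the last decimal place.

```python
import random
import statistics

def percentile_interval(values, confidence=0.95):
    """Return the endpoints of the middle `confidence` share of `values`."""
    ordered = sorted(values)
    # Number of values to cut from each tail (2.5% per tail for 95%).
    cut = round(len(ordered) * (1 - confidence) / 2)
    return ordered[cut], ordered[len(ordered) - cut - 1]

# Hypothetical data: 50 simulated temperatures, NOT the lecture's data set.
rng = random.Random(0)
sample = [round(rng.gauss(98.25, 0.75), 1) for _ in range(50)]

# Bootstrap distribution of the mean: 5000 resamples drawn with replacement.
boot_means = [
    statistics.mean(rng.choices(sample, k=len(sample))) for _ in range(5000)
]

low, high = percentile_interval(boot_means)
print(f"95% bootstrap CI for the mean: {low:.3f} to {high:.3f}")
```

Passing `confidence=0.90` or `confidence=0.99` to the same function gives narrower or wider intervals from the same bootstrap distribution.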

Comparison: Bootstrapping vs. Traditional Formulas

  • StatKey uses bootstrapping, while Statcato uses traditional normal formulas for confidence intervals.
  • Bootstrapping does not require a normality assumption or a standard-error formula; it works directly from the sample data, which still needs to be a representative random sample.
  • Bootstrapping can be used for parameters like the median and standard deviation, which are harder to handle with formulas.
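
To see the two approaches side by side, the sketch below computes both intervals for the mean of one sample. It assumes the formula-based interval is the z-based one, mean ± z · s / sqrt(n); a given package may use a t critical value instead. The function names, seeds, and simulated data are hypothetical and are not taken from StatKey or any other tool.

```python
import random
import statistics
from statistics import NormalDist

def normal_formula_ci(sample, confidence=0.95):
    """Normal-approximation interval: mean +/- z * s / sqrt(n)."""
    n = len(sample)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * statistics.stdev(sample) / n ** 0.5
    center = statistics.mean(sample)
    return center - margin, center + margin

def bootstrap_ci(sample, statistic, confidence=0.95, n_resamples=5000, seed=None):
    """Percentile bootstrap interval for any statistic of the sample."""
    rng = random.Random(seed)
    n = len(sample)
    boot = sorted(statistic(rng.choices(sample, k=n)) for _ in range(n_resamples))
    cut = round(n_resamples * (1 - confidence) / 2)  # values trimmed per tail
    return boot[cut], boot[n_resamples - cut - 1]

# Hypothetical data: 50 simulated temperatures, NOT the lecture's data set.
data_rng = random.Random(0)
sample = [round(data_rng.gauss(98.25, 0.75), 1) for _ in range(50)]

formula_ci = normal_formula_ci(sample)
boot_ci = bootstrap_ci(sample, statistics.mean, seed=1)

print(f"formula:   {formula_ci[0]:.3f} to {formula_ci[1]:.3f}")
print(f"bootstrap: {boot_ci[0]:.3f} to {boot_ci[1]:.3f}")
```

For a roughly bell-shaped sample like this one, the two intervals should nearly coincide. The practical difference is that `bootstrap_ci` accepts any statistic, while `normal_formula_ci` is tied to the mean.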

Bootstrapping for Other Statistics

  • You can bootstrap the median to create a confidence interval for the population median.
  • You can also bootstrap the standard deviation, which can be challenging with traditional formulas.
  • Example: The 95% confidence interval for the population standard deviation of body temperature is 0.572 to 0.953 degrees Fahrenheit.
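
Switching to another statistic only changes the function applied to each resample. The sketch below reuses one percentile-bootstrap helper for both the median and the standard deviation. The helper name `bootstrap_ci`, the seeds, and the simulated data are assumptions for illustration, so the printed intervals will not match the lecture's figures.

```python
import random
import statistics

def bootstrap_ci(sample, statistic, confidence=0.95, n_resamples=5000, seed=None):
    """Percentile bootstrap interval for any statistic of the sample."""
    rng = random.Random(seed)
    n = len(sample)
    boot = sorted(statistic(rng.choices(sample, k=n)) for _ in range(n_resamples))
    cut = round(n_resamples * (1 - confidence) / 2)  # values trimmed per tail
    return boot[cut], boot[n_resamples - cut - 1]

# Hypothetical data: 50 simulated temperatures, NOT the lecture's data set.
data_rng = random.Random(0)
sample = [round(data_rng.gauss(98.25, 0.75), 1) for _ in range(50)]

# Same procedure, different statistic: only the function passed in changes.
intervals = {
    name: bootstrap_ci(sample, stat, seed=2)
    for name, stat in [("median", statistics.median), ("std dev", statistics.stdev)]
}

for name, (low, high) in intervals.items():
    print(f"95% bootstrap CI for the {name}: {low:.3f} to {high:.3f}")
```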

Key Terms & Definitions

  • Bootstrapping — A method of approximating a sampling distribution by repeatedly sampling with replacement from a single sample.
  • Sampling Distribution — The probability distribution of a statistic across all possible random samples of a given size from a population.
  • Confidence Interval — A range of values, computed from the sample data, that is likely to contain the population parameter at a stated confidence level (for example, 95%).
  • Sample Statistic — A numerical summary (like mean or median) calculated from the sample data.

Action Items / Next Steps

  • Practice using bootstrapping to calculate confidence intervals for mean, median, and standard deviation.
  • Explore StatKey or similar tools to perform your own bootstrap analyses.
  • Review the differences between formula-based and bootstrap confidence intervals.