Ch.3 Variance and Standard Deviation

Sep 9, 2025

Overview

This lecture introduces measures of spread (dispersion) in statistics, explaining how data varies around the center, why it matters, and how to calculate key measures like range, interquartile range, variance, and standard deviation.

What Are Measures of Spread?

  • Measures of spread describe how data values are distributed around the center (mean or median).
  • They help determine how well the mean or median represents the data set.
  • They are used in fields such as test scoring, economics, investing, gambling, and polling.

Key Measures of Spread

Range

  • Range = largest value - smallest value in a data set.
  • Shows the distance between the most extreme data points.
  • Sensitive to outliers, because it depends only on the two extreme values and says nothing about how the values between them are distributed.
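
As a small illustration of the range and its sensitivity to one extreme value, here is a minimal sketch. The data set `times` and the helper name `data_range` are made up for this example and do not come from the lecture.

```python
# Hypothetical sample: finishing times in seconds (invented for illustration).
times = [12, 15, 11, 14, 13, 40]

def data_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)

print(data_range(times))       # 29: dominated by the single value 40
print(data_range(times[:-1]))  # 4: the same data without the extreme value
```

Dropping one value changes the range from 29 to 4, which is why the range alone can give a misleading picture of spread.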

Interquartile Range (IQR)

  • IQR = Q3 – Q1, where Q1 and Q3 are the medians of the lower and upper halves of the sorted data.
  • Represents the spread of the middle 50% of data.
  • Less affected by outliers than the range.
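
The median-of-halves rule described above can be sketched as follows. One assumption is labeled in the code: when the number of values is odd, the middle value is left out of both halves. Textbooks and software differ on this point, so other tools may report slightly different quartiles. The helper name `iqr` and the data set `scores` are invented for the example.

```python
from statistics import median

def iqr(values):
    """IQR = Q3 - Q1, with Q1 and Q3 taken as medians of the two halves."""
    if len(values) < 2:
        raise ValueError("need at least two values to form two halves")
    data = sorted(values)
    half = len(data) // 2
    # Assumption: for an odd count, the middle value belongs to neither half.
    lower = data[:half]
    upper = data[-half:]
    return median(upper) - median(lower)

# Hypothetical scores with one extreme value (30).
scores = [2, 4, 4, 5, 7, 8, 9, 30]
print(iqr(scores))                 # 4.5 (Q1 = 4, Q3 = 8.5)
print(max(scores) - min(scores))   # 28, the range, for comparison
```

The extreme value 30 inflates the range to 28, while the IQR of 4.5 still reflects the middle half of the data.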

Variance

  • Variance measures how far data points lie from the mean, computed as the average of the squared deviations from the mean.
  • For a sample, variance = (sum of squared deviations from the mean) divided by (number of data points – 1).
  • Units of variance are squared units (e.g., seconds², wins²).
  • Variance increases as data become more spread out and is strongly affected by outliers, because each deviation is squared.
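
The sample variance formula above can be worked through step by step. This is a minimal sketch with an invented data set. The helper name `sample_variance` is hypothetical, and the standard-library function `statistics.variance` is used only as a cross-check.

```python
from statistics import variance

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)

# Hypothetical data: mean is 5, squared deviations are 9, 1, 1, 0, 4, 9.
data = [2, 4, 4, 5, 7, 8]
print(sample_variance(data))  # 24 / 5 = 4.8
print(variance(data))         # standard-library sample variance, for comparison
```

If the data were measured in seconds, the result 4.8 would be in seconds², which is why the standard deviation is usually easier to interpret.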

Standard Deviation

  • Standard deviation = square root of variance, returning to original units.
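  • Indicates roughly how far a typical data point lies from the mean (strictly, it is the square root of the average squared deviation, not the plain average of the deviations).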
  • High standard deviation means data is more spread out; low means data is clustered near the mean.
  • Like variance, it's sensitive to extreme values.
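
To make the contrast between high and low standard deviation concrete, here is a minimal sketch comparing two invented data sets that share the same mean. The names `sample_std`, `clustered`, and `spread_out` are made up for the example, and `statistics.stdev` serves only as a cross-check.

```python
import math
from statistics import stdev

def sample_std(values):
    """Square root of the sample variance (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sample_var = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(sample_var)

# Two hypothetical data sets, both with mean 10.
clustered = [9, 10, 10, 10, 11]
spread_out = [2, 6, 10, 14, 18]

print(round(sample_std(clustered), 2))   # 0.71
print(round(sample_std(spread_out), 2))  # 6.32
print(round(stdev(spread_out), 2))       # 6.32, standard-library check
```

Both sets have the same mean, but the mean describes the clustered set far better than the spread-out one.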

Practical Implications & Examples

  • Removing outliers (extreme values) lowers the range, variance, and standard deviation, and it shifts the mean more than the median.
  • Standard deviation indicates how well the mean summarizes the data: the smaller it is, the closer typical values lie to the mean.
  • YouTube channel example: a higher standard deviation in audience age means viewers' ages are more spread out, i.e., a more age-diverse viewer base.
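
The effect of removing an outlier can be checked numerically. The sketch below uses an invented data set with one extreme value (60) and prints each summary before and after removing it. The helper name `summarize` is hypothetical, and the statistics come from Python's standard `statistics` module.

```python
from statistics import mean, median, stdev

def summarize(values):
    """Collect the center and spread measures discussed in these notes."""
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "stdev": stdev(values),  # sample standard deviation (divisor n - 1)
    }

# Hypothetical data with a single outlier, 60.
with_outlier = [10, 12, 13, 14, 15, 60]
without_outlier = [x for x in with_outlier if x != 60]

before = summarize(with_outlier)
after = summarize(without_outlier)
for name in before:
    print(f"{name:>6}: {before[name]:8.2f} -> {after[name]:8.2f}")
```

In this example the range and standard deviation drop sharply once 60 is removed, and the mean moves by several units while the median moves by only half a unit.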

Key Terms & Definitions

  • Range: Difference between the largest and smallest data points.
  • Interquartile Range (IQR): Difference between the third and first quartiles (Q3 – Q1); shows the spread of the middle half of the data.
  • Variance: Average squared distance of data points from the mean (dividing by n – 1 for a sample).
  • Standard Deviation: Square root of the variance; the typical distance of data points from the mean, in the original units.
  • Outlier: Data point much higher or lower than most others; it can greatly affect the range, variance, and standard deviation.

Action Items / Next Steps

  • Practice calculating range, IQR, variance, and standard deviation with sample data sets.
  • Watch for outliers when interpreting mean and standard deviation.
  • Review the next lecture for more on outliers.