Understanding Variance and Standard Deviation

Jul 24, 2025

Overview

This lecture explains how to calculate and interpret variance, standard deviation, typical values, and outliers in a dataset with a normal (bell-shaped) distribution.

Variance and Standard Deviation

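  • Variance is the average of the squared deviations from the mean (for a sample, the sum of squared deviations is divided by n − 1 rather than n); it equals the square of the standard deviation.
  • Standard deviation is the square root of the variance and describes roughly how far values typically lie from the mean, in the same units as the data.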
  • In the brick weight data example, the standard deviation was approximately 14.7 kilograms.
  • Standard deviation is rounded to one more decimal place than the original data.
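
The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own worked example: the `weights` list is hypothetical (the brick data is not reproduced in these notes), and the n − 1 divisor assumes the sample variance that the symbol s² usually denotes.

```python
import math
import statistics

# Hypothetical weights in kg, for illustration only
# (NOT the lecture's brick data).
weights = [20, 30, 40, 50, 60]

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (len(values) - 1)

variance = sample_variance(weights)   # s squared
std_dev = math.sqrt(variance)         # s, the square root of the variance

# The data are whole numbers, so report the standard deviation
# with one more decimal place than the data.
std_dev_rounded = round(std_dev, 1)

# Cross-check against the standard library's sample variance.
library_variance = statistics.variance(weights)

print(variance, std_dev_rounded)
```

Dividing by `len(values)` instead of `len(values) - 1` would give the population variance, which `statistics.pvariance` computes.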

Identifying Typical Values

  • Typical values fall within one standard deviation above or below the mean.
  • For normally distributed data, about 68% of values lie within one standard deviation of the mean.
  • Calculation: Mean ± Standard Deviation (e.g., 36.1 kg ± 14.7 kg yields 21.4 kg to 50.8 kg for typical brick weights).
  • Visually, these are the middle data points in the plot.
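
As a quick check of the arithmetic, the typical range can be computed directly from the mean and standard deviation quoted in these notes. The function name `typical_range` is a hypothetical helper introduced only for this sketch.

```python
def typical_range(mean, std_dev, decimals=1):
    """Return (low, high) for mean - SD and mean + SD, rounded for reporting."""
    low = round(mean - std_dev, decimals)
    high = round(mean + std_dev, decimals)
    return low, high

# Mean and standard deviation of the brick weights, as given in the notes.
typical_low, typical_high = typical_range(36.1, 14.7)

print(f"Typical brick weights: {typical_low} kg to {typical_high} kg")
```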

Identifying Outliers

  • Outliers are values more than two standard deviations away from the mean.
  • High outliers: ≥ Mean + 2 × Standard Deviation (top 2.5%).
  • Low outliers: ≤ Mean – 2 × Standard Deviation (bottom 2.5%).
  • In the example, cutoffs were 65.5 kg (high) and 6.7 kg (low).
  • Bricks weighing 70 kg (high) and 3 kg (low) were considered outliers.
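
The two-standard-deviation rule can be sketched the same way. The helper names `outlier_cutoffs` and `is_outlier` are hypothetical, and the sketch treats a value exactly on a cutoff as an outlier, following the ≥ and ≤ in the bullets above.

```python
def outlier_cutoffs(mean, std_dev, k=2):
    """Return (low_cutoff, high_cutoff) at k standard deviations from the mean."""
    return mean - k * std_dev, mean + k * std_dev

def is_outlier(value, mean, std_dev, k=2):
    """True if the value is at or beyond either cutoff."""
    low_cutoff, high_cutoff = outlier_cutoffs(mean, std_dev, k)
    return value <= low_cutoff or value >= high_cutoff

# Mean and standard deviation of the brick weights, as given in the notes.
low_cutoff, high_cutoff = outlier_cutoffs(36.1, 14.7)
print(round(low_cutoff, 1), round(high_cutoff, 1))

# The two bricks flagged in the lecture, plus one weight near the mean.
for weight in (70, 3, 40):
    print(weight, is_outlier(weight, 36.1, 14.7))
```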

Data Distribution Zones

  • Data is not just typical values and outliers; there is a "could happen" zone between them.
  • Most data (middle 68%) is typical; outliers are only a small portion at the extremes.
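  • Values between one and two standard deviations from the mean (in the example, 6.7 kg to 21.4 kg and 50.8 kg to 65.5 kg) are neither typical nor outliers.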
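
One way to picture the three zones is a small classifier. This is a sketch under assumptions: `classify` is a hypothetical helper, values exactly one standard deviation from the mean are counted as typical, and values exactly two standard deviations away are counted as outliers.

```python
def classify(value, mean, std_dev):
    """Label a value as 'typical', 'could happen', or 'outlier'."""
    distance = abs(value - mean)
    if distance <= std_dev:
        return "typical"        # within one standard deviation of the mean
    if distance >= 2 * std_dev:
        return "outlier"        # two or more standard deviations away
    return "could happen"       # between one and two standard deviations

# Mean and standard deviation of the brick weights, as given in the notes.
MEAN, STD_DEV = 36.1, 14.7

# 70 kg and 3 kg are the lecture's outliers; the other weights are
# made-up test values chosen to land in the remaining zones.
labels = {w: classify(w, MEAN, STD_DEV) for w in (3, 10, 40, 55, 70)}
print(labels)
```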

Data Analysis Report Summary

  • The dataset was normal (bell-shaped) with a mean of 36.1 kg and a standard deviation of 14.7 kg.
  • Typical values ranged from 21.4 kg to 50.8 kg.
  • There were two outliers: 70 kg (high) and 3 kg (low).

Key Terms & Definitions

  • Variance — The average of squared differences from the mean; symbolized as s².
  • Standard Deviation — The square root of variance; measures spread from the mean.
  • Mean — The average of all values in a dataset.
  • Outlier — A value more than two standard deviations from the mean.
  • Typical Value — A value within one standard deviation of the mean.

Action Items / Next Steps

  • Practice calculating variance and standard deviation with sample data.
  • Identify typical values and outliers in a given dataset.
  • Prepare for discussion on the empirical rule (68-95-99.7%) in future lectures.