Box Plot Interpretation and Construction

Jul 12, 2025

Overview

This lecture explains how to interpret and construct box (box-and-whisker) plots for numerical data, focusing on the five-number summary, identifying outliers, and relating box plots to histograms.

Introduction to Box Plots

  • Box plots (also called box-and-whisker plots) are used to graph numerical data.
  • Box plots visually summarize the five-number summary: minimum, Q1, median, Q3, and maximum.
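
As a rough illustration of the five-number summary, the sketch below computes it with Python's standard library. The function name five_number_summary and the sample data are made up for this example. Quartile conventions differ between textbooks and software; the "inclusive" method used here is only one of them, so Q1 and Q3 may not match a hand calculation exactly.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    values = sorted(data)
    # Assumption: the "inclusive" quartile method. Other conventions
    # can give slightly different Q1 and Q3 for the same data.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return values[0], q1, median, q3, values[-1]

scores = [2, 4, 4, 5, 7, 8, 9, 10, 12]  # made-up example data
print(five_number_summary(scores))  # (2, 4.0, 7.0, 9.0, 12)
```

For this sample the box would run from 4 to 9, with the median line at 7.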

Structure of a Box Plot

  • The main rectangle ("box") spans from Q1 (lower quartile) to Q3 (upper quartile), so its ends are Q1 and Q3.
  • The line inside the box marks the median.
  • The length of the box is the interquartile range: IQR = Q3 − Q1.
  • "Whiskers" extend from the box to the minimum and maximum values not considered outliers.

Dealing with Outliers in Box Plots

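  • Outliers are plotted as individual dots beyond the whiskers.
  • Each whisker ends at the most extreme data point (highest or lowest) that is not an outlier.
  • When outliers are present, the whisker ends mark the minimum and maximum of the non-outlier data, not of the full dataset; the true extremes appear as outlier dots.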
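
The notes do not state a numeric rule for what counts as an outlier. The sketch below assumes the widely used 1.5 × IQR convention (values below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR are flagged) and shows how the whisker ends then fall on the most extreme non-outlier values. The function name, the parameter k, and the data are hypothetical.

```python
import statistics

def whiskers_and_outliers(data, k=1.5):
    """Return (lower whisker end, upper whisker end, list of outliers)."""
    values = sorted(data)
    # Assumption: "inclusive" quartiles and the k * IQR fence rule.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Points inside the fences are ordinary data; the rest are outliers.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    # Whiskers stop at the most extreme points that are not outliers.
    return inside[0], inside[-1], outliers

data = [2, 4, 4, 5, 7, 8, 9, 10, 30]  # made-up data with one large value
print(whiskers_and_outliers(data))  # (2, 10, [30])
```

Here Q1 = 4 and Q3 = 9, so the fences sit at −3.5 and 16.5. The value 30 is drawn as a dot, and the upper whisker stops at 10, the next largest value.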

Interpreting Data Distribution

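  • The five-number summary splits the data into four intervals (min–Q1, Q1–median, median–Q3, Q3–max), each holding about 25% of the data.
  • These four "chunks" (quartiles) are equal in the number of data points they contain, not in width; a wider chunk means the data are more spread out over that interval.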
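
To make the "equal counts, unequal widths" idea concrete, the sketch below counts the points in each quarter and measures each quarter's width. The function name and the data are invented for illustration, and the "inclusive" quartile method is an assumption. With tied values or other quartile conventions the counts may be only approximately equal.

```python
import statistics

def quarter_counts_and_widths(data):
    """Return (points per quarter, width of each quarter)."""
    values = sorted(data)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    edges = [values[0], q1, median, q3, values[-1]]
    # Count the points that fall in each of the four intervals.
    counts = [
        sum(1 for v in values if v <= q1),
        sum(1 for v in values if q1 < v <= median),
        sum(1 for v in values if median < v <= q3),
        sum(1 for v in values if v > q3),
    ]
    # Width of each interval on the number line.
    widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
    return counts, widths

data = [1, 2, 3, 4, 5, 6, 8, 10, 13, 17, 25, 40]  # made-up, right-skewed
counts, widths = quarter_counts_and_widths(data)
print(counts)  # [3, 3, 3, 3]
print(widths)  # [2.75, 3.25, 7.0, 26.0]
```

Each quarter holds 3 of the 12 points (25%), yet the last quarter is far wider than the first because the large values are spread thinly.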

Box Plots and Histograms

  • Box plots and histograms both show the distribution of the data: a histogram shows frequencies across bins, while a box plot shows the five-number summary.
  • Right-skewed data typically have a longer right whisker and correspond to right-skewed histograms.
  • Left-skewed data typically have a longer left whisker and correspond to left-skewed histograms.
  • Symmetric data have whiskers of roughly equal length.
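
The whisker comparison above can be sketched as a small helper. This is only a rough heuristic, not a formal test of skewness: the function name describe_shape and the 20% tolerance for "equal length" are arbitrary choices made for this example.

```python
def describe_shape(minimum, q1, q3, maximum, tolerance=0.2):
    """Classify shape from whisker lengths (rough heuristic only)."""
    left = q1 - minimum    # length of the left whisker
    right = maximum - q3   # length of the right whisker
    longer = max(left, right)
    # Treat whiskers as equal if they differ by at most `tolerance`
    # (a hypothetical 20% by default) of the longer whisker.
    if longer == 0 or abs(right - left) <= tolerance * longer:
        return "roughly symmetric"
    return "right-skewed" if right > left else "left-skewed"

print(describe_shape(1, 3.75, 14.0, 40))   # right-skewed
print(describe_shape(1, 3.75, 9.25, 12))   # roughly symmetric
```

The inputs are the whisker ends and box ends read off a box plot, so the check can be done without the raw data. Comparing the result with a histogram of the same data is still the more reliable way to judge shape.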

Key Terms & Definitions

  • Box Plot/Box-and-Whisker Plot — A graph showing minimum, Q1, median, Q3, and maximum (five-number summary) of numerical data.
  • Five-number summary — Minimum, Q1 (lower quartile), median, Q3 (upper quartile), and maximum.
  • Interquartile Range (IQR) — The distance between Q1 and Q3 (Q3 − Q1), spanning the middle 50% of the data.
  • Outlier — Data point far above or below the rest of the data, shown as an individual dot beyond the whiskers. A common convention flags values more than 1.5 × IQR below Q1 or above Q3.
  • Skewed Right — Distribution with a longer right tail/whisker.
  • Skewed Left — Distribution with a longer left tail/whisker.
  • Symmetric — Distribution with whiskers of similar length on both sides.

Action Items / Next Steps

  • Identify the next largest/smallest non-outlier value for whisker placement when outliers are present.
  • Compare box plots to corresponding histograms to assess distribution shape (symmetry/skewness).
  • Practice drawing box plots using datasets with and without outliers.