Understanding Five Number Summary and Box Plots
May 30, 2025
Five Number Summary, Box Plots, and Outliers
Overview
The Five Number Summary gives a concise description of a data distribution using only five numbers:
Minimum
First Quartile (Q1)
Median
Third Quartile (Q3)
Maximum
Key Concepts
Five Number Summary
Minimum: Smallest value in the data set.
Maximum: Largest value in the data set.
Median: Middle value of the ordered data; about 50% of the data values lie below it and about 50% lie above it.
First Quartile (Q1): Median of the lower half of the data set; about 25% of the data values are below it.
Third Quartile (Q3): Median of the upper half of the data set; about 75% of the data values are below it.
Calculating the Five Number Summary
Order data values from smallest to largest.
Find the median, then split the ordered data into a lower half and an upper half. Q1 is the median of the lower half and Q3 is the median of the upper half.
The Interquartile Range (IQR) is calculated as Q3 - Q1 and measures the spread of the middle 50% of the data.
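The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `five_number_summary` and the `scores` data are made up for the example, and it assumes the common textbook convention of leaving the middle value out of both halves when the number of data values is odd (other conventions give slightly different quartiles).

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    data = sorted(values)               # step 1: order smallest to largest
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data values")
    half = n // 2
    lower = data[:half]                 # lower half
    upper = data[half + (n % 2):]       # upper half; skips the middle value when n is odd (assumed convention)
    return data[0], median(lower), median(data), median(upper), data[-1]

# Hypothetical data set, invented for illustration
scores = [7, 15, 36, 39, 40, 41, 42, 43, 47, 49]
minimum, q1, med, q3, maximum = five_number_summary(scores)
iqr = q3 - q1                           # Interquartile Range
print(minimum, q1, med, q3, maximum, iqr)   # 7 36 40.5 43 49 7
```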
Box Plots
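Box Plot: Visual representation of the five number summary.
In a horizontal box plot, vertical lines mark each of the five numbers.
Horizontal lines ("whiskers") extend from the box out to the minimum and maximum, showing the spread of the data beyond the quartiles.
The box spans from Q1 to Q3, so its length is the IQR.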
Modified Box Plot: Adjusted for outliers.
Outliers are shown as separate dots.
Whiskers extend only to the lowest and highest non-outlier values.
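To make the parts of a box plot concrete, here is a rough text-only sketch that draws a one-line horizontal box plot from a five number summary. The function name `ascii_box_plot`, the `width` parameter, and the sample numbers are assumptions made for illustration. It draws a plain (unmodified) box plot, so the whiskers run all the way to the minimum and maximum.

```python
def ascii_box_plot(minimum, q1, med, q3, maximum, width=50):
    """Draw a one-line horizontal box plot scaled to `width` characters."""
    span = maximum - minimum
    if span <= 0:
        raise ValueError("maximum must be greater than minimum")

    def col(x):
        # Map a data value to a character column between 0 and width - 1
        return round((x - minimum) / span * (width - 1))

    row = [" "] * width
    for i in range(col(minimum), col(q1)):        # left whisker
        row[i] = "-"
    for i in range(col(q3), col(maximum) + 1):    # right whisker
        row[i] = "-"
    for i in range(col(q1), col(q3) + 1):         # the box, whose length is the IQR
        row[i] = "="
    for x in (minimum, q1, med, q3, maximum):     # vertical lines for the five numbers
        row[col(x)] = "|"
    return "".join(row)

# Hypothetical five number summary, invented for illustration
print(ascii_box_plot(7, 36, 40.5, 43, 49, width=43))
```

A long left whisker next to a short box, as in this made-up example, suggests that the lowest quarter of the data is far more spread out than the middle half.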
Identifying Outliers
A data value is an outlier if it is either:
Less than Q1 - 1.5 * IQR
Greater than Q3 + 1.5 * IQR
Example: For a data set with Q1 = 25 and Q3 = 36, so that IQR = 36 - 25 = 11:
Low boundary for outliers: 25 - 1.5 * 11 = 8.5
High boundary for outliers: 36 + 1.5 * 11 = 52.5
Any value less than 8.5 or greater than 52.5 is an outlier (e.g., 59 is an outlier).
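The same rule can be checked with a short Python sketch. The helper names `outlier_fences` and `split_outliers` are hypothetical, and the `values` list is invented so that its quartiles match the example (Q1 = 25, Q3 = 36); only the quartiles and the value 59 come from the notes.

```python
def outlier_fences(q1, q3):
    """Return the (low, high) outlier boundaries from the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def split_outliers(values, q1, q3):
    """Return (outliers, low whisker end, high whisker end) for a modified box plot."""
    low, high = outlier_fences(q1, q3)
    inside = [v for v in values if low <= v <= high]          # non-outlier values
    outliers = sorted(v for v in values if v < low or v > high)
    return outliers, min(inside), max(inside)

low, high = outlier_fences(25, 36)
print(low, high)                        # 8.5 52.5

# Hypothetical data, invented so that Q1 = 25 and Q3 = 36
values = [12, 24, 26, 28, 30, 33, 35, 37, 59]
print(split_outliers(values, 25, 36))   # ([59], 12, 37)
```

In a modified box plot of this made-up data, 59 would be drawn as a separate dot and the upper whisker would stop at 37.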
Comparing Data Sets
Side-by-side box plots drawn on a common scale allow different data sets to be compared both visually and numerically, for example by their medians, IQRs, ranges, and outliers.
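As a rough illustration of the numerical side of such a comparison, the sketch below derives the center and spread from two five number summaries. The function name `center_and_spread` and both summaries are hypothetical, invented for the example.

```python
def center_and_spread(summary):
    """Turn a (min, Q1, median, Q3, max) tuple into the numbers usually compared."""
    minimum, q1, med, q3, maximum = summary
    return {"median": med, "iqr": q3 - q1, "range": maximum - minimum}

# Hypothetical five number summaries for two data sets
class_a = (52, 68, 75, 83, 98)
class_b = (40, 61, 78, 90, 100)

for name, summary in (("Class A", class_a), ("Class B", class_b)):
    print(name, center_and_spread(summary))
# Class A {'median': 75, 'iqr': 15, 'range': 46}
# Class B {'median': 78, 'iqr': 29, 'range': 60}
```

With these made-up numbers, Class B has the slightly higher median but a much wider box and longer overall range, which is the kind of difference side-by-side box plots make visible at a glance.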