Understanding Quartiles and Box Plots

May 24, 2025

Measures of Position

Objectives

  • How to find the first, second, and third quartiles of a data set.
  • How to find the interquartile range (IQR) of a data set.
  • How to represent a data set graphically using a box and whisker plot.

Quartiles

  • Quartiles divide an ordered data set into four approximately equal parts.
    • First quartile (Q1): About one quarter of the data fall on or below Q1.
    • Second quartile (Q2/Median): About one half of the data fall on or below Q2.
    • Third quartile (Q3): About three quarters of the data fall on or below Q3.

Example

  • Data: Amounts of fuel wasted (in gallons per year) by commuters in the 15 largest U.S. urban areas.
    • Ordered Data: from 11 to 35.

Calculate Quartiles

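  1. Overall Median (Q2):
    • The median of the full ordered data set (the 8th of the 15 values) is 25.
  2. First Quartile (Q1):
    • The median of the lower half of the data (the values before the median, 11 to 25) is 23.
  3. Third Quartile (Q3):
    • The median of the upper half of the data (the values after the median, up to 35) is 30.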
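
The median-of-halves procedure above can be sketched in Python. This is a minimal illustration, not the course's official method: the function name quartiles_by_halves and the sample list are hypothetical (the original 15 fuel values are not listed in these notes), and the sketch assumes that when the number of values is odd, the overall median is left out of both halves.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: for an odd number of values, the overall median is
    excluded from both the lower and the upper half.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                # values before the median position
    upper = data[half + 1:] if n % 2 else data[half:]  # values after the median position
    return median(lower), median(data), median(upper)

# Hypothetical sample (NOT the fuel data from the example).
sample = [3, 5, 7, 8, 12, 13, 14, 18, 21]
q1, q2, q3 = quartiles_by_halves(sample)
print(q1, q2, q3)  # 6.0 12 16.0
```

Here the lower half is 3, 5, 7, 8 and the upper half is 13, 14, 18, 21, so Q1 and Q3 are each the average of the two middle values of their half.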

Interpretation

  • Q1 (23): About 25% of areas waste 23 gallons or less.
  • Q2/Median (25): About 50% of areas waste 25 gallons or less.
  • Q3 (30): About 75% of areas waste 30 gallons or less.

Interquartile Range (IQR)

  • Definition: Measure of variation representing the range of the middle 50% of the data.
  • Calculation: IQR = Q3 - Q1 = 30 - 23 = 7.
  • Use: The IQR is used to identify outliers in the data set.

Outliers

  • Calculation:
    • Multiply the IQR by 1.5 to define the outlier boundaries: 1.5 * IQR = 1.5 * 7 = 10.5.
      • Lower boundary: Q1 - 1.5 * IQR = 23 - 10.5 = 12.5.
      • Upper boundary: Q3 + 1.5 * IQR = 30 + 10.5 = 40.5.
    • Any data value less than 12.5 or greater than 40.5 is an outlier.
  • Example: In the data set, 11 is an outlier because 11 is less than 12.5.
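
The boundary arithmetic above can be checked with a short Python sketch. It uses only the values stated in these notes (Q1 = 23, Q3 = 30, and the data values 11 and 35); the function names outlier_fences and is_outlier are hypothetical helpers introduced for illustration.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier boundaries: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_outlier(x, q1, q3, k=1.5):
    """True if x lies strictly outside the outlier boundaries."""
    lower, upper = outlier_fences(q1, q3, k)
    return x < lower or x > upper

# Quartiles from the fuel example in these notes.
q1, q3 = 23, 30
lower, upper = outlier_fences(q1, q3)
print(lower, upper)            # 12.5 40.5
print(is_outlier(11, q1, q3))  # True: 11 is below 12.5
print(is_outlier(35, q1, q3))  # False: 35 is inside the boundaries
```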

Box and Whisker Plot

  • Five-Number Summary: Minimum, Q1, Median (Q2), Q3, Maximum.
  • Construction:
    • Box is formed using Q1 and Q3.
    • Median is marked inside the box.
    • Whiskers extend to the minimum and maximum data values.
  • Example Construction:
    • Min: 11, Q1: 23, Median: 25, Q3: 30, Max: 35.
    • Left Whisker (min to Q1): Represents about the lower 25% of the data.
    • Right Whisker (Q3 to max): Represents about the upper 25% of the data.
    • Box (Q1 to Q3): Represents about the middle 50% of the data.
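
To make the construction concrete, the five-number summary from the example can be drawn as a rough one-line text plot. This is only a sketch: ascii_box_plot is a hypothetical helper, it assumes integer values and one character per gallon, and it stands in for a proper plot from StatCrunch or a plotting library.

```python
def ascii_box_plot(minimum, q1, median, q3, maximum):
    """Draw a one-line text box plot, one character per unit (integers only)."""
    chars = []
    for x in range(minimum, maximum + 1):
        # '=' fills the box (Q1 to Q3); '-' fills the whiskers.
        chars.append("=" if q1 <= x <= q3 else "-")
    chars[0] = "|"                  # whisker end at the minimum
    chars[-1] = "|"                 # whisker end at the maximum
    chars[q1 - minimum] = "["       # left edge of the box (Q1)
    chars[q3 - minimum] = "]"       # right edge of the box (Q3)
    chars[median - minimum] = "M"   # median marked inside the box
    return "".join(chars)

# Five-number summary from the example: Min 11, Q1 23, Median 25, Q3 30, Max 35.
plot = ascii_box_plot(11, 23, 25, 30, 35)
print(plot)
```

The long left whisker in the output reflects how far the minimum (11) sits from the rest of the data.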

StatCrunch

  • Tool: StatCrunch can calculate quartiles and the IQR and construct box plots.
  • Note on Excel: Excel's quartile results may not match StatCrunch's because the two programs use different algorithms to calculate quartiles.
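
The effect of different quartile algorithms can be seen with Python's standard statistics.quantiles function, which offers two interpolation methods. This sketch makes no claim about which algorithm StatCrunch or Excel actually uses; it only shows that two common conventions can disagree on the same data. The sample list is hypothetical.

```python
from statistics import quantiles

# Hypothetical sample (NOT the fuel data from the example).
sample = [3, 5, 7, 8, 12, 13, 14, 18, 21]

# n=4 asks for the three cut points Q1, Q2, Q3.
exclusive = quantiles(sample, n=4, method="exclusive")
inclusive = quantiles(sample, n=4, method="inclusive")

print(exclusive)  # [6.0, 12.0, 16.0]
print(inclusive)  # [7.0, 12.0, 14.0]
```

Both methods agree on the median (12) but give different values for Q1 and Q3, so the IQR and the outlier boundaries would differ as well.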

Conclusion

  • Understanding quartiles, the IQR, and box plots helps in analyzing the distribution and variability of a data set.