Understanding Five Number Summary and Z-Scores

Sep 9, 2024

Five Number Summary and Box and Whisker Plots

Five Number Summary

  • Definition: A concise summary of a dataset using five key points.
    • Minimum: The smallest data point.
    • Q1: The first quartile, separating the lowest 25% of the data.
    • Median (Q2): The middle value of the ordered dataset.
    • Q3: The third quartile, the value below which about 75% of the data fall.
    • Maximum: The largest data point.
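
As a minimal sketch of the definition above, the five values can be computed in Python. The function name five_number_summary and the sample data are made up for illustration. The quartiles here use the "median of each half" convention (the overall median is left out of both halves when the number of values is odd), which is an assumption: other quartile conventions exist and can give slightly different Q1 and Q3.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data, excluding the middle value when n is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                # values below the median position
    upper = data[half + 1:] if n % 2 else data[half:]  # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical dataset, not taken from the notes
sample = [2, 4, 5, 7, 8, 10, 12, 13]
print(five_number_summary(sample))  # (2, 4.5, 7.5, 11.0, 13)
```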

Box and Whisker Plot

  • Description: A graphical representation of the five number summary.
    • Box: Extends from Q1 to Q3, with a line indicating the median (Q2).
    • Whiskers: Lines extend from the box to the minimum and maximum values.
    • Scale: The plot is drawn along a horizontal scale showing the data values.

Example 4

  • Task: Draw a box and whisker plot using a dataset.
  • Process:
    • Determine the horizontal scale from minimum (11) to maximum (35).
    • Align Q1 (23), median (25), and Q3 (30) on the scale.
    • Complete the box and whiskers representation.
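
The steps in Example 4 can be sketched as a one-line text plot. This is only an illustration under stated assumptions: the helper ascii_boxplot is hypothetical, it assumes all five values are integers, and it uses one character per unit on the scale ("-" for whiskers, "+" for the box edges at Q1 and Q3, "=" for the inside of the box, "|" for the median).

```python
def ascii_boxplot(minimum, q1, med, q3, maximum):
    """Render a text box plot, one character per integer on the scale."""
    chars = []
    for value in range(minimum, maximum + 1):
        if value == med:
            chars.append("|")   # median line
        elif value in (q1, q3):
            chars.append("+")   # left and right edges of the box
        elif q1 < value < q3:
            chars.append("=")   # inside of the box
        else:
            chars.append("-")   # whiskers
    return "".join(chars)

# Five-number summary from Example 4: min 11, Q1 23, median 25, Q3 30, max 35
print(ascii_boxplot(11, 23, 25, 30, 35))
# ------------+=|====+-----
```

The long left whisker reflects the wide gap between the minimum (11) and Q1 (23).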

Percentiles and Fractiles

Definitions

  • Percentiles: Divide data into 100 equal parts.
  • Deciles: Divide data into 10 equal parts.
  • Quartiles: Divide data into 4 equal parts.

Usage

  • Educational/Health Applications: Indicate relative standing (e.g., a child in the 95th percentile is taller than 95% of peers).
  • Quartiles and Percentiles:
    • Q1 = 25th percentile
    • Q2 (Median) = 50th percentile
    • Q3 = 75th percentile

Example 5

  • Task: Interpret percentiles using an ogive (cumulative frequency graph).
  • Example: The 80th percentile on an SAT score graph is approximately 1250.

Example 6

  • Task: Calculate the percentile corresponding to a data point ($34,000 tuition).
  • Process:
    • Count data entries less than $34,000 (8 entries of 25 total).
    • Calculate percentile: (\frac{8}{25} \times 100 = 32), i.e., the 32nd percentile.
    • Interpretation: A tuition of $34,000 is higher than 32% of the entries in the dataset.
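
The counting procedure in Example 6 can be sketched as follows. The function name percentile_of is hypothetical, and the list below is placeholder data built only to reproduce the counts in the example (8 of 25 entries below 34, in thousands of dollars). It is not the actual tuition dataset, which these notes do not list.

```python
def percentile_of(value, data):
    """Percentage of entries in data that are strictly less than value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

# Placeholder tuitions in thousands of dollars: 8 entries below 34, 17 at or above
tuitions = [30] * 8 + [34] + [40] * 16

print(percentile_of(34, tuitions))  # 32.0
```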

Standard Scores (Z-Scores)

Definition

  • Z-Score: Measures how many standard deviations a data point is from the mean.
    • Positive if above mean, negative if below.
    • Formula: (z = \frac{x - \text{mean}}{\text{standard deviation}})

Typical Ranges

  • Normal scores: Within 2 standard deviations of the mean (-2 ≤ z ≤ 2).
  • Unusual scores: More than 2 standard deviations from the mean (z < -2 or z > 2).
  • Very unusual scores: More than 3 standard deviations from the mean (z < -3 or z > 3).
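
A small sketch of how these ranges could be applied to a z-score. The function name classify_z and the label strings are illustrative choices, and treating exactly ±2 as "normal" is an assumption based on the word "within".

```python
def classify_z(z):
    """Label a z-score using the ranges listed above."""
    distance = abs(z)          # the sign does not matter, only the distance from the mean
    if distance > 3:
        return "very unusual"
    if distance > 2:
        return "unusual"
    return "normal"

for z in (0.5, -2.25, 3.4):
    print(z, classify_z(z))
```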

Example 7

  • Dataset: Vehicle speeds with mean = 56 mph, standard deviation = 4 mph.
  • Speeds: Car 1 = 62 mph, Car 2 = 47 mph, Car 3 = 56 mph.
  • Calculations:
    • Z-score for Car 1 (62 mph): (z = \frac{62 - 56}{4} = 1.5) (normal)
    • Z-score for Car 2 (47 mph): (z = \frac{47 - 56}{4} = -2.25) (unusual)
    • Z-score for Car 3 (56 mph): (z = \frac{56 - 56}{4} = 0) (exactly the mean)
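
The calculations in Example 7 can be checked with a short script. The function name z_score and the dictionary layout are illustrative. The mean, standard deviation, and speeds are the ones given in the example.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

MEAN_MPH = 56
STD_DEV_MPH = 4
speeds = {"Car 1": 62, "Car 2": 47, "Car 3": 56}

# Compute the z-score for each car from Example 7
z_scores = {car: z_score(mph, MEAN_MPH, STD_DEV_MPH) for car, mph in speeds.items()}

for car, z in z_scores.items():
    print(f"{car}: z = {z}")
# Car 1: z = 1.5
# Car 2: z = -2.25
# Car 3: z = 0.0
```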