Overview
This lecture covers percentiles, quartiles, and the five-number summary: tools for describing how data are distributed without assuming normality. These concepts lay the groundwork for box plots, which visualize the shape of a data set.
Percentiles
- The P-th percentile is the value below which P percent of the data fall.
- The remaining (100 - P) percent of the data fall at or above that value.
- Example: a score at the 89th percentile means 89% of scores fall below it and 11% fall at or above it.
- No 0th percentile exists (only one value would qualify).
- No 100th percentile exists (cannot have 100% below a value that is itself part of the dataset).
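The definition above can be checked by counting how many observations fall strictly below a given value. The following is a minimal sketch; the `percent_below` helper and the `scores` list are hypothetical and used only for illustration.

```python
def percent_below(data, value):
    """Return the percentage of observations strictly below `value`."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)


# Hypothetical test scores, invented for this illustration.
scores = [55, 60, 62, 70, 71, 75, 80, 84, 90, 95]

print(percent_below(scores, 90))  # 80.0 -> 8 of the 10 scores are below 90
print(percent_below(scores, 95))  # 90.0 -> even the maximum has less than 100% below it
```

The second call illustrates why no value in the data set can sit at the 100th percentile: a value is never below itself.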
Quartiles
Quartiles divide data into four equal parts, similar to quarters of a dollar.
| Quartile | Percentile | Special Name |
|---|---|---|
| Q1 | 25th | First Quartile |
| Q2 | 50th | Median (Second Quartile) |
| Q3 | 75th | Third Quartile |
- Q2 equals the median: half the data fall below this value.
Finding the Median
- Order the data first, then find the median's position: Location = (n + 1) / 2.
- For an even number of values: The location ends in .5, so average the two middle values. If the result is 3.5, average the 3rd and 4th values.
- Example: {1, 2, 4, 6, 8, 9} has n = 6, location = 3.5, median = (4 + 6) / 2 = 5.
- For an odd number of values: The location is a whole number, so select the value at that position.
- Example: {1, 3, 5, 6, 7} has n = 5, location = 3, median = 5.
- Alternative method: Mark off values from both ends until one or two remain; average them if two remain.
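The location rule can be sketched in a few lines of Python. This is one possible illustration, not a reference implementation; the function name `median_by_location` is an assumption made for this example.

```python
def median_by_location(values):
    """Find the median using Location = (n + 1) / 2 on the ordered data."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    location = (n + 1) / 2  # 1-based position in the ordered data
    if location.is_integer():
        # Odd n: the location points at a single middle value.
        return ordered[int(location) - 1]
    # Even n: the location ends in .5, so average the two neighbours.
    lower = ordered[int(location) - 1]  # e.g. location 3.5 -> 3rd value
    upper = ordered[int(location)]      # e.g. location 3.5 -> 4th value
    return (lower + upper) / 2


print(median_by_location([1, 2, 4, 6, 8, 9]))  # 5.0
print(median_by_location([1, 3, 5, 6, 7]))     # 5
```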
Procedure for Finding Quartiles
- Step 1: Order data from smallest to largest.
- Step 2: Find Q2 (median) using the entire dataset.
- Step 3: Find Q1 as the median of the values positioned below Q2. If Q2 is itself a data value (odd n), exclude it.
- Step 4: Find Q3 as the median of the values positioned above Q2. If Q2 is itself a data value (odd n), exclude it.
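The four steps can be sketched as follows. This is a minimal illustration of the median-of-halves procedure described here, with the halves split by position in the ordered data; software packages often use other quartile conventions and may return different values. The names `median_sorted` and `quartiles` are assumptions made for this example.

```python
def median_sorted(ordered):
    """Median of an already ordered list, using Location = (n + 1) / 2."""
    n = len(ordered)
    location = (n + 1) / 2
    if location.is_integer():
        return ordered[int(location) - 1]
    return (ordered[int(location) - 1] + ordered[int(location)]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves procedure."""
    if len(values) < 2:
        raise ValueError("need at least two values to form two halves")
    ordered = sorted(values)          # Step 1: order the data
    n = len(ordered)
    q2 = median_sorted(ordered)       # Step 2: median of the entire data set
    half = n // 2
    lower = ordered[:half]            # values positioned below Q2
    upper = ordered[n - half:]        # values positioned above Q2 (skips the middle value when n is odd)
    q1 = median_sorted(lower)         # Step 3
    q3 = median_sorted(upper)         # Step 4
    return q1, q2, q3


print(quartiles([1, 2, 4, 6, 8, 9]))  # (2, 5.0, 8)
print(quartiles([1, 3, 5, 6, 7]))     # (2.0, 5, 6.5)
```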
Worked Example with 22 Values
Dataset: {111, 120, 136, 142, 158, 182, 184, 185, 192, 194, 209, 234, 261, 271, 289, 290, 319, 335, 359, 387, 411, 439}
| Statistic | Calculation | Value |
|---|---|---|
| Q2 (Median) | (209 + 234) / 2 at position 11.5 | 221.5 |
| Q1 | Median of first 11 values, position 6 | 182 |
| Q3 | Median of last 11 values, position 6 (renumbered) | 319 |
- For n = 22: Median location = (22 + 1) / 2 = 11.5 (average 11th and 12th values).
- Q1 from first 11 values: Location = (11 + 1) / 2 = 6.
- Q3 from last 11 values: Location = (11 + 1) / 2 = 6.
Five-Number Summary
The five-number summary gives a compact description of a data distribution and supplies the values needed to draw a box plot.
| Component | Description | Example Value |
|---|---|---|
| Minimum | Lowest value | 111 |
| Q1 | First Quartile (25th percentile) | 182 |
| Q2 | Median (50th percentile) | 221.5 |
| Q3 | Third Quartile (75th percentile) | 319 |
| Maximum | Highest value | 439 |
- These five values are used to construct box plots in the next lecture.
- This method works for any distribution, not just normal distributions.
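As a check on the table above, the five-number summary of the 22-value dataset can be computed with a short sketch. It assumes the same median-of-halves procedure used in this lecture; the function names are hypothetical.

```python
def median_of_sorted(ordered):
    """Median of an already ordered list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, Q2, Q3, maximum) using medians of the lower and upper halves."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]       # values positioned below the median
    upper = ordered[n - half:]   # values positioned above the median
    return (
        ordered[0],
        median_of_sorted(lower),
        median_of_sorted(ordered),
        median_of_sorted(upper),
        ordered[-1],
    )


data = [111, 120, 136, 142, 158, 182, 184, 185, 192, 194, 209,
        234, 261, 271, 289, 290, 319, 335, 359, 387, 411, 439]

print(five_number_summary(data))  # (111, 182, 221.5, 319, 439)
```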
Key Terms & Definitions
- Percentile: Value below which a specified percentage of data falls.
- Quartiles: Values (Q1, Q2, Q3) dividing ordered data into four equal parts.
- Median: Middle value of the ordered data; also called Q2 or the 50th percentile.
- Five-Number Summary: Minimum, Q1, Q2, Q3, and maximum values describing data distribution.