Overview
This lecture explains how to choose appropriate measures of center (mean or median) and spread (standard deviation or IQR) based on the shape of data distributions.
Choosing Center and Spread
- Identify the shape of your graph before selecting measures of center and spread.
- Use the median for the center and IQR (Interquartile Range) for the spread with skewed data (left or right).
- Use the mean for the center and standard deviation for the spread with symmetric data.
- Keep the measures paired according to shape: report the mean with the standard deviation, or the median with the IQR.
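The selection rule above can be sketched in code. The lecture judges shape by looking at a graph; as a rough numeric stand-in, this sketch uses a moment-based skewness value and an illustrative cutoff of 0.5. The function names, the cutoff, and the two data sets are assumptions made for the example, not part of the lecture.

```python
import statistics

def sample_skewness(data):
    """Moment-based skewness: 0 for perfectly symmetric data,
    positive for a longer right tail, negative for a longer left tail."""
    n = len(data)
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    if sd == 0:
        return 0.0  # all values identical, so there is no tail at all
    return sum(((x - mean) / sd) ** 3 for x in data) / n

def choose_measures(data, threshold=0.5):
    """Return (center, spread) names. The threshold is an assumed,
    illustrative cutoff; a histogram is still the better judge."""
    if abs(sample_skewness(data)) < threshold:
        return ("mean", "standard deviation")
    return ("median", "IQR")

# Hypothetical data sets for illustration
symmetric = [2, 3, 4, 5, 6, 7, 8]
right_skewed = [1, 1, 2, 2, 3, 4, 20]

print(choose_measures(symmetric))     # ('mean', 'standard deviation')
print(choose_measures(right_skewed))  # ('median', 'IQR')
```

A numeric cutoff like this is only a convenience; borderline cases should still be settled by inspecting the graph.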
Rationale for Choosing Measures
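- Mean and standard deviation use every data value, so they summarize symmetric distributions well, where no long tail distorts them.
- In symmetric distributions, the mean and median are nearly equal, so either one describes the center well.
- Skewed distributions have extreme values in the long tail, often including outliers, that pull the mean toward the tail and away from the majority of the data.
- Median and IQR depend on the position of values rather than their size, so they are less affected by extreme values and are better for skewed data.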
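A small numeric illustration of this rationale follows. The values are hypothetical, and the quartiles come from Python's `statistics.quantiles` with its default method, which may differ slightly from the quartile convention used in the course.

```python
import statistics

def summarize(data):
    """Compute both pairs of measures for one data set."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # default quartile method
    return {
        "mean": statistics.fmean(data),
        "median": statistics.median(data),
        "sd": statistics.stdev(data),   # sample standard deviation
        "iqr": q3 - q1,
    }

# Hypothetical, roughly symmetric values, then the same values plus one outlier
values = [40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
base = summarize(values)
with_outlier = summarize(values + [400])

for name in ("mean", "median", "sd", "iqr"):
    print(f"{name:>6}: {base[name]:8.2f} -> {with_outlier[name]:8.2f}")
```

In this run the single extreme value moves the mean from 45 to about 74.6 and inflates the standard deviation many times over, while the median moves only from 45 to 45.5 and the IQR changes by half a unit.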
Interpretations
- Both mean and median represent the "typical value" or center of the data.
- Standard deviation describes typical spread around the mean; in a normal distribution, about 68% of the data fall within one standard deviation of the mean.
- IQR describes the spread of the middle 50% of data.
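Both percentages can be checked against a normal model. This is a minimal sketch using `NormalDist` from Python's standard library; the standard normal (mean 0, standard deviation 1) is chosen only for convenience.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal model

# Share of data within one standard deviation of the mean
within_one_sd = z.cdf(1) - z.cdf(-1)

# Quartiles of the model and the share of data between them
q1, q3 = z.inv_cdf(0.25), z.inv_cdf(0.75)
middle_half = z.cdf(q3) - z.cdf(q1)

print(f"within 1 SD of the mean: {within_one_sd:.4f}")  # 0.6827
print(f"between Q1 and Q3:       {middle_half:.4f}")    # 0.5000
print(f"IQR in SD units:         {q3 - q1:.3f}")        # 1.349
```

The 68% figure is specific to normal distributions, whereas the IQR covers the middle 50% of any data set by construction.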
Key Terms & Definitions
- Mean — the arithmetic average, calculated by summing all values and dividing by the number of values.
- Median — the middle value when data are ordered (the average of the two middle values when the count is even); divides data into two equal halves.
- Standard Deviation — a measure of spread showing roughly how far data values typically fall from the mean.
- IQR (Interquartile Range) — the distance between the first and third quartiles (Q3 - Q1); measures spread of the middle 50% of data.
- Skewed Distribution — a distribution where data are not symmetric and have a longer tail on one side.
- Symmetric Distribution — a distribution where the left and right sides are mirror images.
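The four numeric definitions above can be worked through by hand on a small data set. In this sketch the data values are hypothetical, the standard deviation uses the sample formula (dividing by n - 1), and the quartiles are taken as the medians of the lower and upper halves. Both choices are assumptions, since the notes do not say which conventions the course uses.

```python
import math

def middle(values):
    """Median of an already sorted list."""
    k = len(values) // 2
    if len(values) % 2:                      # odd count: single middle value
        return values[k]
    return (values[k - 1] + values[k]) / 2   # even count: average the two

data = sorted([7, 3, 9, 5, 11, 4, 8, 6])  # hypothetical values
n = len(data)

# Mean: sum of all values divided by the number of values
mean = sum(data) / n

# Median: middle of the ordered data
median = middle(data)

# Sample standard deviation: root of the average squared deviation (n - 1)
sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# Quartiles as medians of each half (the middle value is excluded if n is odd)
half = n // 2
q1 = middle(data[:half])
q3 = middle(data[half + n % 2:])
iqr = q3 - q1

print(f"mean={mean}, median={median}, sd={sd:.3f}, Q1={q1}, Q3={q3}, IQR={iqr}")
```

For these values the mean is 6.625, the median is 6.5, the quartiles are 4.5 and 8.5, and the IQR is 4.0. Software packages may report slightly different quartiles because several conventions are in use.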
Action Items / Next Steps
- Practice identifying distribution shapes and selecting appropriate center and spread measures.
- Review previous sections (3.1 and 3.3) for foundational concepts.