Crash Course Statistics: Measures of Central Tendency
Introduction
- Presenter: Adriene Hill
- Topic: Measures of central tendency (means, medians, modes)
- Importance of middle numbers in statistics and their influence in reports and debates.
Mean (Average)
- Definition: Sum of all data points divided by the number of data points.
- Examples:
- If 10 pregnant dogs give birth to 50 total puppies, the average litter size is 5 puppies.
- If you and a friend have $10 and $20 respectively, the mean is $15.
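- Normal Distribution: Symmetric bell curve in which data points are spread evenly on both sides of the mean.
- Misleading Mean: Can be pulled toward outliers, e.g., mean life expectancy in the Middle Ages was dragged down by many deaths in infancy, so it understates how long adults typically lived.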
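The mean calculations above can be sketched with Python's standard `statistics` module. This is a minimal illustration, not data from the video: the individual litter sizes and the ages at death are made-up values, chosen only so that the totals match the notes and the pull of the outliers is visible.

```python
from statistics import mean

# Individual litter sizes are invented for illustration; only the totals
# (10 dogs, 50 puppies) come from the notes.
litter_sizes = [3, 4, 4, 5, 5, 5, 6, 6, 6, 6]
average_litter = mean(litter_sizes)      # 50 puppies / 10 litters

wallet_mean = mean([10, 20])             # you have $10, a friend has $20

# Hypothetical ages at death: a few deaths in infancy next to adults who
# lived into their 50s, 60s, and 70s.
ages_at_death = [0, 1, 2, 55, 60, 62, 65, 70]
mean_age = mean(ages_at_death)

print(average_litter, wallet_mean, mean_age)
```

In the hypothetical ages, nobody dies anywhere near the mean age; the three infant deaths pull it well below the ages the adults actually reached.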
Median
- Definition: Middle value when the data set is ordered from smallest to largest.
- Example:
- With data points: 1, 2, 3 – the median is 2.
- With an even number of data points: 1, 2, 3, 4 – the median is (2+3)/2 = 2.5.
- Resilience to Outliers: Less affected by extremely large or small values.
- Example with Outliers: Median doesn't change much, even if a high-value outlier (e.g., Elon Musk) is added to the group.
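A short sketch of the median examples, again using the standard `statistics` module. The net-worth figures are hypothetical placeholders (including the very large one standing in for a billionaire), picked only to show how differently the mean and median react to one extreme value.

```python
from statistics import mean, median

odd_median = median([1, 2, 3])        # middle value of an odd-sized set
even_median = median([1, 2, 3, 4])    # average of the two middle values

# Hypothetical net worths in dollars for a small group of people.
group = [40_000, 55_000, 60_000, 75_000, 90_000]
# Add one extreme outlier (a placeholder figure, not a real net worth).
group_with_outlier = group + [200_000_000_000]

median_before = median(group)
median_after = median(group_with_outlier)
mean_before = mean(group)
mean_after = mean(group_with_outlier)

print(odd_median, even_median)
print(median_before, median_after)
print(mean_before, mean_after)
```

The median moves only from one ordinary value toward its neighbor, while the mean jumps to tens of billions, a figure that describes no one in the group.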
Mode
- Definition: Most frequently occurring value(s) in a data set.
- Example:
- For Amazon book reviews with 200 5-star and 200 1-star ratings, both 1 and 5 are modes (bimodal data).
- Multimodal Data: Data sets with more than one mode.
- Use Cases: Reveals distinct peaks in a distribution, e.g., separate lunch and dinner rushes at a restaurant.
- Non-numeric Data: Can be used for categorical data, e.g., favorite colors.
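The mode examples can be sketched with `statistics.multimode`, which returns every value tied for most frequent. The 200 five-star and 200 one-star counts come from the notes; the 50 three-star ratings and the list of favorite colors are assumptions added for illustration.

```python
from statistics import multimode

# 200 five-star and 200 one-star reviews (from the notes), plus a
# hypothetical 50 three-star reviews so the data has a non-modal value.
ratings = [5] * 200 + [1] * 200 + [3] * 50
rating_modes = sorted(multimode(ratings))

# The mode also works for categorical data (hypothetical survey answers).
favorite_colors = ["blue", "green", "blue", "red", "blue", "green"]
color_modes = multimode(favorite_colors)

print(rating_modes)
print(color_modes)
```

`multimode` requires Python 3.8 or later. For the bimodal ratings, a mean would land near 3 stars, a rating that few reviewers actually gave.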
Skewness in Data
- Normal Distribution: Mean, median, and mode are the same.
- Skewed Distribution: Extreme values (outliers) pull the mean away from the median, so the two differ.
- Example:
- Mean income rose while median income fell after the financial crisis, indicating that income gains went only to those at the top.
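The income pattern above can be reproduced with a toy data set. The figures below are entirely hypothetical, not real income statistics; they are chosen so that every earner except the top one loses a little while the top earner gains a lot.

```python
from statistics import mean, median

# Hypothetical annual incomes for five households, before and after.
incomes_before = [30_000, 40_000, 50_000, 60_000, 200_000]
incomes_after = [28_000, 38_000, 48_000, 58_000, 400_000]

mean_change = mean(incomes_after) - mean(incomes_before)
median_change = median(incomes_after) - median(incomes_before)

print("mean change:", mean_change)
print("median change:", median_change)
```

Here the mean rises even though four of the five households earn less, which is why comparing the mean with the median says more than either number alone.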
Conclusion
- Statistics: Can be both true and deceptive.
- Critical Thinking: Importance of understanding the context and the questions being answered by statistics.
- Practical Advice: Use common sense and skepticism in interpreting statistics.