Understanding Data Variability and Spread
Mar 26, 2025
Lecture Notes on Measures of Spread in Data
Introduction
Purpose: Understand how to use numbers to describe the spread of data.
Variability: The extent to which values within a data sample differ from one another.
Low variability: data points are similar.
High variability: data points are different.
Measures of Spread
Maximum and Minimum
Simplest measure of spread.
Together they give the range of the data (maximum minus minimum).
Sensitive to outliers (e.g., a single large outlier can shift max/min significantly).
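A minimal Python sketch of this sensitivity. The scores are invented for illustration and `data_range` is a hypothetical helper, not something from the lecture:

```python
# Hypothetical scores; the values are made up for illustration.
scores = [12, 14, 15, 15, 17, 18]

def data_range(values):
    """Return (minimum, maximum, range) for a non-empty list of numbers."""
    lo, hi = min(values), max(values)
    return lo, hi, hi - lo

print(data_range(scores))         # (12, 18, 6)
print(data_range(scores + [95]))  # (12, 95, 83)
```

Adding one extreme value moves the range from 6 to 83 even though the other six scores are unchanged.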
Quartiles
Median (Q2): Middle value of the ordered data.
First Quartile (Q1): Median of the lower half of the ordered data; about 25% of values fall at or below it.
Third Quartile (Q3): Median of the upper half of the ordered data; about 75% of values fall at or below it.
Quartiles are robust to outliers.
The quartiles split the ordered data into four quarters, each containing about 25% of the values.
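One way to compute quartiles is with Python's standard `statistics.quantiles`. This is only a sketch: the data are invented, and the `"inclusive"` method is an assumption, since textbooks and software use slightly different quartile conventions and may give slightly different values.

```python
import statistics

# Hypothetical ordered sample; the values are made up for illustration.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12]

# n=4 asks for the three cut points Q1, Q2 (median), Q3.
quartiles = statistics.quantiles(data, n=4, method="inclusive")
print(quartiles)  # [4.0, 7.0, 9.0]

# Replacing the largest value with an extreme one leaves the quartiles unchanged.
with_outlier = [2, 4, 4, 5, 7, 8, 9, 10, 120]
quartiles_with_outlier = statistics.quantiles(with_outlier, n=4, method="inclusive")
print(quartiles_with_outlier)  # [4.0, 7.0, 9.0]
```

The second call illustrates robustness: the extreme value changes the maximum but not Q1, Q2, or Q3.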
Variance
Measures the mean of squared deviations from the mean.
Population variance divides by 'N'; sample variance divides by 'N-1' (degrees of freedom).
N-1: Corrects for the bias that arises when estimating a population variance from a sample.
The result is an average squared deviation, expressed in squared units, which makes it less intuitive.
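The two divisors can be compared in a short sketch. The data and the `variance` helper below are illustrative assumptions; the standard library's `statistics.pvariance` and `statistics.variance` are used only as a cross-check.

```python
import statistics

def variance(values, sample=True):
    """Mean of squared deviations; divides by N-1 for a sample, N for a population."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    divisor = n - 1 if sample else n
    return sum(squared_deviations) / divisor

# Hypothetical data: mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(variance(data, sample=False))      # 4.0  (32 / 8)
print(round(variance(data), 3))          # 4.571  (32 / 7)

# Cross-check against the standard library.
matches_population = abs(variance(data, sample=False) - statistics.pvariance(data)) < 1e-9
matches_sample = abs(variance(data) - statistics.variance(data)) < 1e-9
```

Dividing by N-1 gives a slightly larger value, and the gap shrinks as the sample grows.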
Standard Deviation (SD)
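Square root of the variance.
Easier to interpret: a typical deviation from the mean, in the same units as the data.
For roughly bell-shaped data, most values (about 68%) fall within one SD of the mean; values more than two or three SDs away are candidate outliers.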
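A small sketch of counting how much of a data set lies within a given number of SDs. The data are invented, the helper name `share_within` is hypothetical, and the population formula (dividing by N) is an assumption chosen to keep the numbers tidy.

```python
import math

# Hypothetical data with mean 5 and population variance 4.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)
# Population SD (divide by N); use N-1 instead when the data are a sample.
sd = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))

def share_within(values, center, spread, k):
    """Fraction of values lying within k * spread of center."""
    inside = [x for x in values if abs(x - center) <= k * spread]
    return len(inside) / len(values)

print(sd)                                # 2.0
print(share_within(data, mean, sd, 1))   # 0.75
print(share_within(data, mean, sd, 2))   # 1.0
```

With so few points the shares will not match the 68% figure for bell-shaped data exactly; the sketch only shows how the count is made.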
Identifying Outliers
1.5 IQR Rule: Identifies potential outliers.
Calculate IQR (Q3 - Q1), multiply by 1.5.
Maximum cutoff: Q3 + (1.5 * IQR).
Minimum cutoff: Q1 - (1.5 * IQR).
Values outside these cutoffs are potential outliers.
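The steps above can be sketched in Python as follows. The function names and data are hypothetical, and the quartiles come from `statistics.quantiles` with the `"inclusive"` method, which is one convention among several.

```python
import statistics

def iqr_cutoffs(values, k=1.5):
    """Return (minimum cutoff, maximum cutoff) using the k * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def potential_outliers(values, k=1.5):
    """Return the values that fall outside the cutoffs."""
    lower, upper = iqr_cutoffs(values, k)
    return [x for x in values if x < lower or x > upper]

# Hypothetical data: Q1 = 4, Q3 = 9, so IQR = 5 and 1.5 * IQR = 7.5.
data = [2, 4, 4, 5, 7, 8, 9, 10, 40]

print(iqr_cutoffs(data))         # (-3.5, 16.5)
print(potential_outliers(data))  # [40]
```

The function only flags values; whether a flagged value is kept or removed remains a separate judgment.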
Handling Outliers
Possible Causes:
Typographical errors.
Measurement errors.
Incorrect identification of the sample.
Legitimate extreme values.
Deciding on Outliers:
Inclusion: When the value is considered valid data from the population.
Exclusion: When the data point is not representative of the population or was caused by an error.
Decision should be case-specific and explained.
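One way to support and document such a decision is to report summary statistics with and without the flagged value. This is an illustrative sketch only; the data, the flagged value, and the `summarize` helper are assumptions.

```python
import statistics

def summarize(values):
    """Return the mean and median of a list of numbers."""
    return {"mean": statistics.mean(values), "median": statistics.median(values)}

# Hypothetical data in which 40 has been flagged as a potential outlier.
data = [2, 4, 4, 5, 7, 8, 9, 10, 40]
flagged = 40
kept = [x for x in data if x != flagged]

with_point = summarize(data)
without_point = summarize(kept)

print(with_point["median"], without_point["median"])        # 7 6.0
print(round(with_point["mean"], 2), without_point["mean"])  # 9.89 6.125
```

Here the mean shifts far more than the median, which is the kind of evidence that can be reported alongside the reason for including or excluding the point.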
Summary
Numbers can describe the spread of data, just as they describe central tendency.
Measures of spread show how much participants (data points) vary.
They are also important tools for identifying and handling outliers.