
Understanding Sample Mean, Standard Deviation, and Outliers

Mar 28, 2025

Lecture Notes on Sample Mean, Standard Deviation, and Outliers

Key Concepts

The data are the distances, in miles, between 46 retail stores and a central distribution center.
The objective is to calculate the sample mean, sample standard deviation, and identify outliers in the data set.

Sample Mean Calculation

Data Handling:
- Data consists of distances in miles from 46 stores to a distribution center.
- This is a sample, not a population.
Process:
- Sum up all data values.
- Divide the sum by the number of data points (n = 46) to get the mean.
- Store result as X̄ (sample mean).
Result:
- Sample mean (X̄) = 197.2826 miles.
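The process above can be sketched in a few lines of Python. The notes do not list the 46 distances, so the `distances` values below are made up purely for illustration and will not reproduce 197.2826; the variable names are assumptions as well.

```python
# Hypothetical distances in miles; NOT the lecture's 46-store data set.
distances = [150.0, 180.0, 200.0, 210.0, 260.0]

n = len(distances)            # number of data points (46 in the lecture)
x_bar = sum(distances) / n    # sum of all values divided by n

print(x_bar)  # 200.0 for this made-up sample
```

Keeping the result in a variable such as `x_bar`, rather than retyping a rounded value, lets later steps reuse it at full precision.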

Sample Standard Deviation Calculation

Process:
- Calculate differences between each data value and the mean (X - X̄).
- Square each difference so that positive and negative deviations do not cancel.
- Sum these squared differences.
- Divide by n - 1 rather than n, because the data are a sample (Bessel's correction).
- Take the square root of the result to find the standard deviation.
Result:
- Sample standard deviation (s) = 32.4884 miles.
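As a rough sketch, the five steps above translate to Python as follows. The data are again a made-up sample, not the lecture's, so the result differs from 32.4884. The standard library's `statistics.stdev`, which also divides by n - 1, is used only as a cross-check.

```python
import math
import statistics

# Hypothetical distances in miles; NOT the lecture's 46-store data set.
distances = [150.0, 180.0, 200.0, 210.0, 260.0]

n = len(distances)
x_bar = sum(distances) / n

deviations = [x - x_bar for x in distances]   # X - X̄
squared = [d ** 2 for d in deviations]        # square each difference
sum_sq = sum(squared)                         # sum of squared differences
variance = sum_sq / (n - 1)                   # divide by n - 1 (sample)
s = math.sqrt(variance)                       # sample standard deviation

print(s)
print(statistics.stdev(distances))  # should agree with s up to floating-point error
```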

Finding Outliers

Definition:
- In these notes, an outlier is any data point more than two standard deviations from the mean.
Bounds Calculation:
- Upper bound = X̄ + 2s = 262.2594 miles.
- Lower bound = X̄ - 2s = 132.3058 miles.
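The bounds can be checked directly from the values reported in these notes. This small sketch uses the mean and standard deviation as printed (already rounded to four decimal places), so it confirms the arithmetic rather than repeating the full-precision calculation.

```python
x_bar = 197.2826  # sample mean as reported in the notes
s = 32.4884       # sample standard deviation as reported in the notes

upper = x_bar + 2 * s
lower = x_bar - 2 * s

print(f"lower = {lower:.4f}, upper = {upper:.4f}")
# lower = 132.3058, upper = 262.2594
```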
Outliers Identification:
- Flag any data value below the lower bound (132.3058) or above the upper bound (262.2594).
Results:
- Lower end: 132 is an outlier (below the lower bound).
- Upper end: 277 is an outlier (above the upper bound).
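One possible way to do the identification step in Python is sketched below. The bounds are the ones from the notes. The list of distances is hypothetical apart from 132 and 277, which the notes report; the full data set is not given here.

```python
lower, upper = 132.3058, 262.2594  # bounds reported in the notes

# Illustrative values only; 132 and 277 come from the notes, the rest are made up.
distances = sorted([205, 132, 190, 277, 160, 240])

# A value is an outlier if it falls outside [lower, upper].
outliers = [d for d in distances if d < lower or d > upper]

print(distances)  # [132, 160, 190, 205, 240, 277]
print(outliers)   # [132, 277]
```

Because the list is sorted, any outliers appear at the two ends, which is why checking the ends of the ordered data is enough.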

Conclusion

The outliers identified are 132 and 277 miles; each lies more than two standard deviations from the mean.
Sort the data first so that outliers can be found by checking only the two ends.
Carry the stored, full-precision values of X̄ and s through the calculations rather than rounded ones, to avoid rounding errors.

Tips

Use the print function to display several calculations at once, but be aware of an artifact: the last line may be displayed twice.
Confirm outlier values by checking both ends of the ordered data set.
