Understanding Standard Deviation and Outliers

Mar 28, 2025

Lecture Notes: Standard Deviation and Outliers

Importance of Standard Deviation

  • Standard deviation is an essential concept for the course.
  • It will be calculated frequently.

Definition of Outliers

  • Outlier: A value outside the usual range of data values.
  • Outliers matter because they can reveal data errors or genuinely interesting data points.

Example

  • Price list for a small cup of coffee: $2.39, $2.99, $3.09, $259 (no decimal point).
  • The $259 value is likely an error, or possibly an interesting data point (a luxury coffee).
  • Suspicious data points like this one should be reviewed.

Identifying Outliers

  • Outliers are not always obvious: the $259 coffee price is clear, but others may be less so.
  • No universal rule; methods vary among statisticians.

Class Definition of Outliers

  • Rule: A data value more than two standard deviations away from the mean is considered an outlier.

Understanding Standard Deviation

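  • Mean: The average of the data values, used as the center of the data.
  • Standard Deviation: A measure of how far data values typically fall from the mean (loosely, the average deviation).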
  • One Standard Deviation:
    • Above mean: \( \mu + \sigma \)
    • Below mean: \( \mu - \sigma \)
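
For reference, the symbols above can be written out explicitly. This is a sketch that assumes the population formulas for \( N \) data values \( x_1, \dots, x_N \). That notation does not appear in the lecture, and a course working with sample data may divide by \( N - 1 \) instead of \( N \):

\[
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(x_i - \mu\right)^2}
\]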

Two Standard Deviations

  • Data values more than two standard deviations from the mean are rare.
  • Two Standard Deviations Range:
    • Above mean: \( \mu + 2\sigma \)
    • Below mean: \( \mu - 2\sigma \)
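
The two boundaries are simple arithmetic once the mean and standard deviation are known. A minimal Python sketch follows. The function name `two_sd_bounds` and the summary statistics (mean 3.00, standard deviation 0.25) are hypothetical and chosen only for illustration:

```python
def two_sd_bounds(mean, sd):
    """Return (lower, upper) = (mean - 2*sd, mean + 2*sd)."""
    return mean - 2 * sd, mean + 2 * sd


# Hypothetical summary statistics, not taken from the lecture.
lower, upper = two_sd_bounds(mean=3.00, sd=0.25)
print(lower, upper)  # 2.5 3.5
```

Under these assumed numbers, any value below 2.50 or above 3.50 would fall outside the two-standard-deviation range.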

Outliers

  • Almost all data lies within two standard deviations of the mean (about 95% for roughly bell-shaped data).
  • Values beyond this range are considered outliers.
  • Finding Outliers:
    • Calculate mean and standard deviation.
    • Determine boundaries (two standard deviations above and below mean).
    • Values outside these boundaries are outliers.
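
The three steps above can be sketched in Python. This is an illustration under stated assumptions, not the course's prescribed method: `find_outliers` is a hypothetical helper, and it uses the population standard deviation (`statistics.pstdev`), whereas the course may use the sample version (`statistics.stdev`). The price list is padded with hypothetical values because, with only the four prices from the coffee example, the $259 value inflates the mean and standard deviation so much that it stays inside the two-standard-deviation boundaries.

```python
from statistics import mean, pstdev


def find_outliers(values, k=2):
    """Return the values lying more than k standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD (assumption); statistics.stdev gives the sample SD
    lower, upper = mu - k * sigma, mu + k * sigma
    return [v for v in values if v < lower or v > upper]


# 2.39, 2.99, 3.09 and 259 come from the notes; the other six prices are hypothetical padding.
prices = [2.39, 2.99, 3.09, 2.59, 2.79, 2.89, 3.19, 2.49, 2.69, 259.0]
print(find_outliers(prices))  # [259.0]
```

Passing only the original four prices, `find_outliers([2.39, 2.99, 3.09, 259.0])`, returns an empty list, which shows how a single extreme value in a very small data set can hide itself from this rule.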

Conclusion

  • Identifying outliers helps in recognizing errors or understanding significant data points.
  • Practical applications follow in the next examples.