L7

Sep 19, 2024

Lecture Notes on Dispersion and Skewness

Review of Last Class

Discussed ways of quantifying dispersion in a population/sample.
Standard deviation:
- Population: ( \sigma = \sqrt{\sum (x - \mu)^2 / N} )
- Sample: ( s = \sqrt{\sum (x - \bar{x})^2 / (n - 1)} )
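- ( n - 1 ) is used for the sample (Bessel's correction) so that ( s^2 ) is an unbiased estimate of the population variance; dividing by ( n ) tends to underestimate it.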
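
To make the difference between the two divisors concrete, here is a minimal sketch of both formulas. The function names and the data values are made up for illustration and do not come from the lecture.

```python
import math

def population_std(values):
    """Standard deviation with divisor N (values are the whole population)."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)

def sample_std(values):
    """Standard deviation with divisor n - 1 (values are a sample)."""
    n = len(values)
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))

# Hypothetical data, chosen so the population result is a round number.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_std(data))  # 2.0
print(sample_std(data))      # slightly larger, because the divisor is smaller
```

The sample value is always the larger of the two for the same data, and the gap shrinks as ( n ) grows.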

Practical Significance of Standard Deviation

Chebyshev's Theorem: For any distribution and any (k > 1), at least (1 - \frac{1}{k^2}) of the data lies within (k) standard deviations of the mean.
- Example: (k = 2) guarantees at least 75% coverage.
Normal distribution (empirical rule):
- About 68% within ±1 standard deviation of the mean.
- About 95% within ±2 standard deviations of the mean.
- About 99.7% within ±3 standard deviations of the mean.
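
The two sets of figures can be compared side by side. The sketch below assumes the standard identity that a normal variable falls within (k) standard deviations of its mean with probability erf(k / sqrt(2)); the function names are illustrative only.

```python
import math

def chebyshev_lower_bound(k):
    """Guaranteed minimum fraction within k standard deviations (any distribution)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

def normal_coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (2, 3):
    # k = 2: bound 0.75, normal coverage about 0.9545
    print(k, chebyshev_lower_bound(k), round(normal_coverage(k), 4))
```

Chebyshev's bound is much looser because it has to hold for every distribution, while the 68/95/99.7 figures hold only under normality.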

Z-Score

Defined as ( z = \frac{x - \text{mean}}{\text{standard deviation}} ), the number of standard deviations a value lies from the mean.
Used to detect outliers (e.g., (|z| > 3)).
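
A minimal sketch of z-score outlier screening follows. It assumes the sample standard deviation (divisor ( n - 1 )) and the threshold of 3 mentioned above; the helper names and data are hypothetical.

```python
import statistics

def z_scores(values):
    """Return (x - mean) / standard deviation for every value."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divisor n - 1)
    return [(x - mean) / sd for x in values]

def flag_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > threshold]

# Hypothetical data: nineteen values near 10 plus one extreme value.
data = [9] * 6 + [10] * 7 + [11] * 6 + [50]

print(flag_outliers(data))  # [50]
```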

Percentiles and Quartiles

A percentile gives a value's position within a dataset: the (p)th percentile is the value at or below which (p)% of the observations fall.
Quartiles:
- Q1 (25%), Q2 (Median, 50%), Q3 (75%).
- Interquartile Range (IQR): ( Q3 - Q1 ).
Box Plot:
- Visual representation of data spread and outliers.
- Points below the lower fence ( Q1 - 1.5 \times IQR ) or above the upper fence ( Q3 + 1.5 \times IQR ) are flagged as outliers.
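
The quartile and fence calculations can be sketched as below. This assumes Python's `statistics.quantiles` with its default "exclusive" method; textbooks and software use several slightly different quartile conventions, so the lecture's method may give different values on the same data. The function name and data are made up.

```python
import statistics

def box_plot_summary(values):
    """Quartiles, IQR, fences, and flagged outliers for a list of numbers."""
    # n=4 returns the three cut points Q1, Q2 (median), Q3.
    q1, q2, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return {
        "q1": q1, "q2": q2, "q3": q3, "iqr": iqr,
        "lower_fence": lower_fence, "upper_fence": upper_fence,
        "outliers": outliers,
    }

# Hypothetical data with one unusually large value.
data = [2, 4, 5, 7, 8, 9, 10, 12, 13, 15, 40]

summary = box_plot_summary(data)
print(summary["q1"], summary["q3"], summary["iqr"])  # 5.0 13.0 8.0
print(summary["outliers"])                           # [40]
```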

Moments

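Moments describe the shape of a distribution.
Moment about zero: ( M_r^* = \frac{\sum y^r}{n} ); the first moment about zero is the mean, ( M_1^* = \bar{y} ).
Moments about the mean: ( M_r = \frac{\sum (y - \bar{y})^r}{n} ).
- First moment: ( M_1 = 0 ) always, since deviations from the mean sum to zero.
- Second moment: ( M_2 ) is the variance (with divisor ( n )).
- Third moment: ( M_3 ), the basis of the skewness measure.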
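
As a quick check on these definitions, the following sketch computes moments about zero and about the mean with divisor ( n ), as in the formula above. The function names and data are illustrative assumptions, not the lecture's.

```python
def moment_about_zero(values, r):
    """r-th moment about zero: sum(y^r) / n."""
    return sum(y ** r for y in values) / len(values)

def moment_about_mean(values, r):
    """r-th moment about the mean: sum((y - mean)^r) / n."""
    mean = moment_about_zero(values, 1)  # the first moment about zero is the mean
    return sum((y - mean) ** r for y in values) / len(values)

# Hypothetical data with mean 4.
data = [1, 2, 3, 4, 10]

print(moment_about_zero(data, 1))  # 4.0  (the mean)
print(moment_about_mean(data, 1))  # 0.0  (always zero)
print(moment_about_mean(data, 2))  # 10.0 (variance with divisor n)
print(moment_about_mean(data, 3))  # 36.0 (positive: long right tail)
```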

Skewness

Skewness ( A_3 = \frac{M_3}{M_2^{3/2}} ).
Indicates asymmetry:
- ( A_3 = 0 ) for symmetric distributions.
- ( A_3 < 0 ) when the distribution is skewed left (longer left tail).
- ( A_3 > 0 ) when the distribution is skewed right (longer right tail).

Example: Calculating Skewness

Provided example with numeric data to calculate skewness.
Noted that skewness indicates whether data is balanced around the mean.
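
The numeric data from the lecture's example is not reproduced in these notes. As a stand-in, the sketch below applies the skewness formula ( A_3 = M_3 / M_2^{3/2} ) to three made-up datasets; all names and values are hypothetical.

```python
def central_moment(values, r):
    """r-th moment about the mean, with divisor n."""
    mean = sum(values) / len(values)
    return sum((y - mean) ** r for y in values) / len(values)

def skewness(values):
    """Moment coefficient of skewness: M3 / M2^(3/2)."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return central_moment(values, 3) / m2 ** 1.5

# Hypothetical datasets.
right_skewed = [1, 2, 3, 4, 10]      # long right tail
symmetric = [1, 2, 3, 4, 5]          # balanced around the mean
left_skewed = [-10, -4, -3, -2, -1]  # mirror image of right_skewed

print(round(skewness(right_skewed), 4))  # 1.1384
print(skewness(symmetric))               # 0.0
print(round(skewness(left_skewed), 4))   # -1.1384
```

For the first dataset, ( M_2 = 10 ) and ( M_3 = 36 ), so ( A_3 = 36 / 10^{3/2} \approx 1.14 ), a clear right skew driven by the single large value.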

Summary

Discussed measures of dispersion, significance of standard deviation, z-scores, percentiles, quartiles, and moments.
Introduced skewness as a measure of asymmetry in a data distribution.

Note: Always verify calculations and understand the implications of skewness and other statistical measures in data analysis.
