Introductory Statistics Concepts Explained

Apr 26, 2025

Introduction to Statistics with Justin Zeltzer

Overview

  • Aims to explain statistics intuitively in under 30 minutes.
  • Ideal for beginners or those curious about statistics.
  • Examples themed around NBA data.

Types of Data

  • Categorical Data: Divided into
    • Nominal: No order (e.g., NBA teams)
    • Ordinal: Has order (e.g., player positions)
  • Numerical Data: Divided into
    • Discrete: Distinct values (e.g., missed free throws)
    • Continuous: Any value within a range (e.g., height)

Proportions

  • A proportion and a percentage express the same quantity on different scales: a proportion runs from 0 to 1, a percentage from 0 to 100.
  • Example: Steph Curry's three-point percentage.
    • Nominal data (each shot made or missed) is condensed into a single numerical summary.
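
As a minimal sketch of how made/missed outcomes become a proportion, the Python snippet below uses a made-up list of ten shots (not Steph Curry's real record); the variable names are illustrative only.

```python
# Hypothetical shot log: True = made, False = missed (illustrative data only).
shots = [True, False, True, True, False, False, True, False, True, False]

made = sum(shots)               # True counts as 1, so this counts the made shots
proportion = made / len(shots)  # a value between 0 and 1
percentage = proportion * 100   # the same quantity on a 0-100 scale

print(f"proportion = {proportion:.2f}, percentage = {percentage:.0f}%")
```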

Distributions

  • Normal Distribution: The most commonly used distribution in statistics.
    • Example: Heights of NBA players.
    • Symmetrical bell curve, with most data around the mean.
  • Other Distributions:
    • Uniform Distribution: Equal probability across values.
    • Bimodal Distribution: Two peaks.
    • Skewed Distribution: Asymmetric, with a longer tail on one side.
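
The shapes above can be explored with simulated data. The sketch below assumes a mean height of 200 cm and a standard deviation of 9 cm purely for illustration (these are not real NBA figures), and uses an exponential sample as one possible example of a skewed distribution.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Assumed, illustrative parameters (not real NBA figures): mean 200 cm, sd 9 cm.
heights = [random.gauss(200, 9) for _ in range(10_000)]
mean_h = statistics.mean(heights)
sd_h = statistics.stdev(heights)

# For a normal distribution, roughly 68% of values fall within one sd of the mean.
within_one_sd = sum(abs(h - mean_h) <= sd_h for h in heights) / len(heights)

# A right-skewed sample for contrast: the long right tail pulls the mean above the median.
skewed = [random.expovariate(1.0) for _ in range(10_000)]
mean_above_median = statistics.mean(skewed) > statistics.median(skewed)

print(f"share of heights within one sd of the mean: {within_one_sd:.2f}")
print(f"skewed sample has mean above median: {mean_above_median}")
```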

Sampling Distributions

  • Describes how the averages of random samples are distributed, rather than individual data points.
  • Larger sample sizes reduce the variance of the sample mean, making the sampling distribution "skinnier."
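
A small simulation can make the "skinnier" effect concrete. This is a sketch under assumed population values (mean 200, standard deviation 9, chosen only for illustration); the helper name sample_means is hypothetical.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

def sample_means(sample_size, n_samples=2_000, mu=200, sigma=9):
    """Draw n_samples random samples and return the mean of each one."""
    means = []
    for _ in range(n_samples):
        sample = [random.gauss(mu, sigma) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return means

# Spread of the sample means for small and large samples.
spread_n5 = statistics.stdev(sample_means(5))
spread_n50 = statistics.stdev(sample_means(50))

print(f"sd of sample means, n=5:  {spread_n5:.2f}")
print(f"sd of sample means, n=50: {spread_n50:.2f}")
```

The spread of the means should land near sigma divided by the square root of the sample size, so the n=50 figure comes out roughly a third of the n=5 figure.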

Sampling and Estimation

  • Sample Statistic: An estimate of a population parameter (e.g., Steph Curry’s shooting percentage is a sample estimate of his true ability).
  • Confidence Intervals: A range computed from the sample by a method that captures the true parameter value in a stated share of repeated samples (e.g., 95%).
  • Larger samples produce more reliable estimates.
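
To illustrate how sample size affects an interval estimate, here is a minimal sketch using the normal-approximation (Wald) interval for a proportion. The choice of this particular interval and the shot counts are assumptions for illustration; the notes do not specify a method.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical counts: the same 40% hit rate seen in a small and a large sample.
small = proportion_ci(40, 100)
large = proportion_ci(400, 1000)

print(f"n=100:  ({small[0]:.3f}, {small[1]:.3f})")
print(f"n=1000: ({large[0]:.3f}, {large[1]:.3f})")
```

Both intervals are centered on 0.40, but the one built from 1,000 shots is noticeably narrower.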

Parameters and Sample Statistics

  • Parameters (Greek symbols): True population values, generally unknown
    • Mu (μ): Mean
    • Sigma (σ): Standard deviation
    • Pi (π): Proportion
    • Rho (ρ): Correlation
    • Beta (β): Gradient
  • Sample Statistics (Roman symbols): Estimates derived from data
    • X-bar (x̄): Sample mean
    • s: Sample standard deviation
    • p: Sample proportion
    • r: Sample correlation
    • b: Sample gradient
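
Each sample statistic can be computed directly from data. The sketch below uses invented numbers (minutes played, points scored, and a shot count for five hypothetical games) and computes the sample gradient as the least-squares slope, which is an assumption about what "gradient" refers to.

```python
import math
import statistics

# Hypothetical paired data: minutes played and points scored in five games.
minutes = [20, 25, 30, 35, 40]
points = [10, 14, 15, 21, 25]

x_bar = statistics.mean(minutes)  # sample mean, estimates mu
s = statistics.stdev(minutes)     # sample standard deviation (n - 1 divisor), estimates sigma

made, attempts = 7, 10            # hypothetical shot counts
p = made / attempts               # sample proportion, estimates pi

y_bar = statistics.mean(points)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(minutes, points))
sxx = sum((x - x_bar) ** 2 for x in minutes)
syy = sum((y - y_bar) ** 2 for y in points)

r = sxy / math.sqrt(sxx * syy)    # sample correlation, estimates rho
b = sxy / sxx                     # least-squares slope (sample gradient), estimates beta

print(f"x_bar={x_bar}, s={s:.2f}, p={p}, r={r:.3f}, b={b:.2f}")
```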

Hypothesis Testing

  • Null Hypothesis (H₀): Initial assumption (no effect/change).
  • Alternate Hypothesis (H₁): What you seek evidence for.
  • Use statistical tests to determine if sample data is extreme enough to reject H₀.
  • Significance Level (α): The tolerated probability of rejecting H₀ when it is actually true, often set at 5%.
  • A test never "proves" a hypothesis; it only weighs the evidence against H₀.
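
The testing procedure can be sketched with a one-sided z-test for a proportion. This specific test is chosen here for illustration and is not named in the notes; the null value of 40% and the 50-of-100 sample are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_proportion(successes, n, pi_0, alpha=0.05):
    """One-sided z-test of H0: pi = pi_0 against H1: pi > pi_0."""
    p_hat = successes / n
    se = math.sqrt(pi_0 * (1 - pi_0) / n)  # standard error assuming H0 is true
    z = (p_hat - pi_0) / se                # how far the sample sits from H0, in standard errors
    p_value = 1 - NormalDist().cdf(z)      # chance of a z this large or larger under H0
    return z, p_value, p_value < alpha

# Hypothetical sample: 50 makes in 100 shots, tested against a true rate of 40%.
z, p_value, reject = z_test_proportion(50, 100, pi_0=0.40)

print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

Rejecting H₀ here means the sample is unusual enough under a 40% rate to doubt that rate; it does not establish the true rate.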

P-Values

  • The probability, assuming H₀ is true, of observing data at least as extreme as the sample.
  • A p-value below the significance level (e.g., p < 0.05) leads to rejecting H₀.
  • A larger p-value means insufficient evidence to reject H₀; it does not show that H₀ is true.
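
A p-value can be computed exactly for small made/missed samples. The sketch below assumes a binomial model and a null hypothesis that the shooter makes 50% of shots; both the model and the counts are illustrative assumptions.

```python
from math import comb

def binomial_p_value(successes, n, pi_0):
    """One-sided p-value: P(X >= successes) when X ~ Binomial(n, pi_0)."""
    return sum(
        comb(n, k) * pi_0**k * (1 - pi_0) ** (n - k)
        for k in range(successes, n + 1)
    )

# Hypothetical results from ten shots under H0: pi = 0.5.
p_8 = binomial_p_value(8, 10, 0.5)  # 56/1024, just above 0.05
p_9 = binomial_p_value(9, 10, 0.5)  # 11/1024, below 0.05

print(f"8 of 10: p = {p_8:.4f}")
print(f"9 of 10: p = {p_9:.4f}")
```

At the 5% level, 8 makes out of 10 is not enough to reject H₀, while 9 out of 10 is.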

Issues in Statistical Research

  • P-hacking: Manipulating data/tests to find significant results.
  • Testing many hypotheses raises the chance of at least one false positive, because every test run at the 5% level carries its own 5% risk of one.
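
The effect of multiple testing can be quantified with a short calculation. This sketch assumes the tests are independent and that every null hypothesis is true; the function name is hypothetical.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests when every H0 is true."""
    return 1 - (1 - alpha) ** n_tests

# The risk grows quickly with the number of tests run.
for m in (1, 5, 20):
    print(f"{m:>2} tests: {familywise_error_rate(m):.3f}")
```

With 20 independent tests at the 5% level, the chance of at least one spurious "significant" result is about 64%.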

Conclusion

  • Understanding basic statistical concepts such as data types, distributions, sampling, and hypothesis testing is crucial for interpreting statistical results and conducting research.
  • Further resources and more detailed explanations available at zstatistics.com.