Understanding Statistics Through Basketball

Aug 26, 2024

Introduction to Statistics Lecture

Presenter

  • Justin Zeltzer from zstatistics.com

Overview

  • Challenge: Explain statistics in under half an hour
  • Aimed at developing intuition around statistics
  • Examples themed around the NBA

Types of Data

Categorical Data

  • Nominal Categorical Data: No order to categories (e.g., sports teams)
  • Ordinal Categorical Data: Categories with a natural order (e.g., player positions in basketball, loosely ordered by typical player size: Guard, Forward, Center)

Numerical Data

  • Discrete Numerical Data: Countable values (e.g., number of free throws missed)
  • Continuous Numerical Data: Any value in a range (e.g., player's height)

Proportions

  • A proportion (often quoted as a percentage) is a numerical summary of nominal data
  • Example: Steph Curry's three-point percentage, where each shot is either a make or a miss
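
As a rough illustration of the idea above, a proportion is the count of one category divided by the total number of observations. The shot log and the helper name `proportion` below are made up for this sketch and are not real game data.

```python
# Hypothetical shot log: each attempt is a nominal outcome, "make" or "miss".
# These values are illustrative only and are not real game data.
shots = ["make", "miss", "make", "make", "miss",
         "miss", "make", "miss", "make", "miss"]

def proportion(outcomes, category):
    """Return the share of outcomes that equal the given category."""
    return sum(1 for outcome in outcomes if outcome == category) / len(outcomes)

three_point_pct = proportion(shots, "make")
print(f"Three-point percentage: {three_point_pct:.0%}")
```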

Distributions

  • Probability Density Function: A curve describing how values are distributed; the area under it over a range gives the probability that a randomly selected observation falls in that range
  • Normal Distribution (Bell Curve): Bulk of data is in the middle
  • Uniform Distribution: Equal probability across outcomes
  • Bimodal Distribution: Two peaks in data
  • Skewed Distribution: One tail is longer than the other, and the direction of the long tail names the skew (e.g., a left-skewed distribution has a long left tail)
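
One way to get a feel for these shapes is to simulate them and compare the mean with the median. The sketch below uses only Python's standard library; the seed, sample size, and the particular generators (for example, a negated exponential standing in for a left-skewed variable) are assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 10_000

samples = {
    "normal": [rng.gauss(0, 1) for _ in range(n)],
    "uniform": [rng.uniform(0, 1) for _ in range(n)],
    # Bimodal: mix two normal curves centred at -3 and +3.
    "bimodal": [rng.gauss(rng.choice([-3, 3]), 1) for _ in range(n)],
    # Left skew: negate an exponential so the long tail points left.
    "left_skew": [-rng.expovariate(1.0) for _ in range(n)],
}

for name, values in samples.items():
    mean = statistics.fmean(values)
    median = statistics.median(values)
    print(f"{name:10s} mean={mean:6.2f} median={median:6.2f}")
```

In the symmetric cases the mean and median should land close together, while in the left-skewed case the long left tail drags the mean below the median.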

Sampling Distributions

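  • The distribution of sample averages across repeated samples
  • Larger samples reduce the variance of the sample average: for independent observations its standard deviation is σ/√n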
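
The effect of sample size can be checked with a small simulation. This is a sketch under assumed values: player heights are taken to be normally distributed with a mean of 200 cm and a standard deviation of 10 cm, which are invented numbers, and `sample_means` is a hypothetical helper.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

def sample_means(sample_size, repeats=2_000, mu=200.0, sigma=10.0):
    """Draw `repeats` samples of heights and return the average of each one."""
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(sample_size)])
        for _ in range(repeats)
    ]

# Spread of the sample average for small and large samples.
spread_small = statistics.stdev(sample_means(5))
spread_large = statistics.stdev(sample_means(50))

print(f"n=5:  spread of averages = {spread_small:.2f} (theory {10 / 5 ** 0.5:.2f})")
print(f"n=50: spread of averages = {spread_large:.2f} (theory {10 / 50 ** 0.5:.2f})")
```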

Estimation

  • Sample Statistic: An estimate of an unknown parameter (e.g., Steph Curry’s 3-point percentage is an estimate for his true skill level)
  • Confidence Intervals: Give a range of plausible values for the true parameter at a stated confidence level (e.g., 95%)
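
A minimal sketch of an interval for a proportion, assuming the normal-approximation (Wald) formula, which is one of several possible methods and not necessarily the one used in the lecture. The counts (43 makes in 100 attempts) and the function name `proportion_ci` are hypothetical.

```python
from statistics import NormalDist

def proportion_ci(successes, attempts, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / attempts
    # Critical value of the standard normal, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * (p_hat * (1 - p_hat) / attempts) ** 0.5
    return p_hat - margin, p_hat + margin

# Hypothetical data: 43 makes in 100 three-point attempts.
low, high = proportion_ci(43, 100)
print(f"95% interval: {low:.3f} to {high:.3f}")
```

With the same observed percentage, ten times as many attempts would give a noticeably narrower interval, which ties back to larger samples reducing the variance of an estimate.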

Parameters in Statistics

  • Mu (μ): Mean of a numerical variable
  • Sigma (σ): Standard deviation
  • Pi (π): Proportion of a categorical variable
  • Rho (ρ): Correlation between variables
  • Beta (β): Gradient in regression
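
Each parameter above has a sample counterpart that can be computed from data. The sketch below works through them with explicit formulas; the five players' heights, weights, and positions are invented for illustration, and the variable names (x_bar, s, p, r, b) are just common labels for the sample estimates.

```python
import statistics

# Hypothetical measurements for five players (not real data).
heights_cm = [185, 191, 198, 203, 211]
weights_kg = [84, 88, 95, 104, 113]
is_guard = [True, True, False, False, False]

x_bar = statistics.mean(heights_cm)   # estimates mu (mean height)
s = statistics.stdev(heights_cm)      # estimates sigma (spread of heights)
p = sum(is_guard) / len(is_guard)     # estimates pi (share of guards)

# Sample covariance between height and weight.
n = len(heights_cm)
y_bar = statistics.mean(weights_kg)
cov_xy = sum(
    (x - x_bar) * (y - y_bar) for x, y in zip(heights_cm, weights_kg)
) / (n - 1)

r = cov_xy / (s * statistics.stdev(weights_kg))  # estimates rho (correlation)
b = cov_xy / statistics.variance(heights_cm)     # estimates beta (slope of weight on height)

print(f"x_bar={x_bar:.1f} s={s:.2f} p={p:.2f} r={r:.3f} b={b:.3f}")
```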

Hypothesis Testing

  • Null Hypothesis (H0): Default assumption that no effect exists (e.g., player’s performance is random)
  • Alternative Hypothesis (H1): Assumes an effect or difference exists
  • Rejection Region: Range of sample results extreme enough that the null hypothesis is rejected
  • Level of Significance: Probability of rejecting a null hypothesis that is actually true; commonly set to 5%
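
To make the rejection region concrete, here is a sketch of a one-sided test on a count of made shots. The setup is an assumption for illustration: under H0 the player makes 40% of attempts, 20 attempts are observed, and the test looks only for evidence of a higher rate. The names `upper_tail` and `reject_null` are hypothetical helpers.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

N_SHOTS = 20   # hypothetical number of attempts
P_NULL = 0.40  # hypothetical make rate under H0
ALPHA = 0.05   # level of significance

# Smallest number of makes whose upper-tail probability under H0 is at most ALPHA.
critical = next(
    k for k in range(N_SHOTS + 2) if upper_tail(k, N_SHOTS, P_NULL) <= ALPHA
)

def reject_null(makes):
    """Reject H0 when the observed count falls in the rejection region."""
    return makes >= critical

print(f"Reject H0 when makes >= {critical}")
```

Any observed count at or above `critical` lies in the rejection region; anything below it does not provide enough evidence against H0 at the chosen level.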

P-Values

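  • Measure how extreme the sample is under the null hypothesis: the probability of a result at least as extreme as the one observed, assuming the null hypothesis is true
  • Small p-values (below the level of significance) suggest rejecting the null hypothesis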
  • P-Hacking: Testing multiple hypotheses to find significant results by chance
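
A p-value can be computed directly for a simple case. The sketch below assumes a one-sided exact binomial test with invented numbers: under H0 the player makes 50% of shots, and 14 makes are observed in 20 attempts. The function name `p_value_upper` is hypothetical.

```python
from math import comb

def p_value_upper(makes, attempts, p_null):
    """One-sided exact binomial p-value: P(X >= makes) if H0 is true."""
    return sum(
        comb(attempts, k) * p_null**k * (1 - p_null) ** (attempts - k)
        for k in range(makes, attempts + 1)
    )

ALPHA = 0.05
# Hypothetical observation: 14 makes in 20 attempts, tested against a 50% rate.
p_value = p_value_upper(14, 20, 0.5)
decision = "reject H0" if p_value < ALPHA else "do not reject H0"
print(f"p-value = {p_value:.4f}: {decision}")
```

Under these assumptions a 70% shooting sample over 20 attempts still falls just short of the 5% threshold, which shows how weak the evidence from a small sample can be.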

Conclusion

  • Introduction to key statistical concepts using basketball examples
  • Emphasized intuition-building for understanding statistics

Extra Section: P-Hacking

  • P-Hacking: Misuse of p-values by testing multiple hypotheses and only reporting significant results
  • Problems with P-Hacking: Increased likelihood of false positives in research
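
The inflation of false positives can be seen in a simulation. This sketch assumes each "study" runs 20 independent two-sided z-tests on pure noise (so every null hypothesis is true) and reports a finding if any test has p < 0.05. The seed, sample sizes, and helper names are assumptions for illustration.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(42)  # fixed seed so the sketch is repeatable
ALPHA = 0.05
TESTS_PER_STUDY = 20
SAMPLE_SIZE = 30
STUDIES = 1_000
std_normal = NormalDist()

def null_p_value():
    """Two-sided z-test p-value for noise where H0 (mean 0, sd 1) is true."""
    sample = [rng.gauss(0, 1) for _ in range(SAMPLE_SIZE)]
    z = fmean(sample) * SAMPLE_SIZE ** 0.5
    return 2 * (1 - std_normal.cdf(abs(z)))

def study_finds_something():
    """True if at least one test in the study comes out 'significant'."""
    return any(null_p_value() < ALPHA for _ in range(TESTS_PER_STUDY))

false_alarm_rate = sum(study_finds_something() for _ in range(STUDIES)) / STUDIES
expected_rate = 1 - (1 - ALPHA) ** TESTS_PER_STUDY

print(f"Simulated share of studies with a false positive: {false_alarm_rate:.2f}")
print(f"Expected share for independent tests: {expected_rate:.2f}")
```

Even though each individual test has only a 5% chance of a false positive, reporting the best of 20 tests makes a spurious "significant" result more likely than not.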

Additional Resources

  • More detailed videos and discussions available on zstatistics.com