Understanding Key Concepts in Statistics

Sep 4, 2024

Introduction to Statistics - Key Concepts

Presenter: Justin Zeltzer from zstatistics.com

Overview

  • Challenge: Explain statistics in under 30 minutes
  • Objective: Develop intuition around statistics without complex math
  • Themed examples: NBA (National Basketball Association)

Types of Data

  1. Categorical Data

    • Nominal: No intrinsic order (e.g., NBA team names)
    • Ordinal: Ordered categories (e.g., player positions Guard, Forward, Center, if ranked by typical player size)
  2. Numerical Data

    • Discrete: Countable values (e.g., missed free throws)
    • Continuous: Measurable values (e.g., player height)

Proportions and Percentages

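  • Proportions are numerical summaries of nominal data: each shot is a made/missed outcome, and the proportion is the share that were made (e.g., Steph Curry's 3-point percentage)
  • Discrete vs. continuous: with n attempts a proportion can only take the values 0/n, 1/n, ..., n/n, so it is strictly discrete, but it is usually treated as continuous when n is large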
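
A minimal sketch of why a proportion sits between discrete and continuous: it lists every value a proportion can take for a given number of attempts and shows how the gap between neighbouring values shrinks as attempts grow. The shot counts and the helper name `possible_proportions` are made up for illustration and are not real player statistics.

```python
from fractions import Fraction

def possible_proportions(attempts):
    """All values a proportion can take after a fixed number of attempts."""
    return [Fraction(made, attempts) for made in range(attempts + 1)]

# Hypothetical counts for illustration only (not real player statistics).
made, attempts = 4, 10
proportion = made / attempts

few = possible_proportions(10)      # 11 possible values
many = possible_proportions(1000)   # 1001 possible values

print(f"proportion = {proportion:.1%}")
print(f"gap between values with 10 attempts:   {float(few[1] - few[0])}")
print(f"gap between values with 1000 attempts: {float(many[1] - many[0])}")
```

With few attempts the proportion jumps in visible steps; with many attempts the steps become so small that treating it as continuous does little harm.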

Distributions

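  • Probability Density Function (PDF): Describes how the values of a continuous variable are distributed; the area under the curve over a range gives the probability of falling in that range
  • Normal Distribution (Bell Curve): Common in statistics; symmetric, with the bulk of the data clustered around the mean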
  • Other Distributions:
    • Uniform (flat distribution)
    • Bimodal (two peaks)
    • Skewed (left/right tails)
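
As a rough illustration of how these shapes differ, the sketch below draws simulated data from a normal, a uniform, and a right-skewed distribution and compares the mean, the median, and the share of values within one standard deviation of the mean. The exponential distribution is an assumed stand-in for "right-skewed", and the sample size and seed are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

N = 20_000
shapes = {
    "normal": [random.gauss(0, 1) for _ in range(N)],
    "uniform": [random.uniform(-1, 1) for _ in range(N)],
    "right-skewed": [random.expovariate(1.0) for _ in range(N)],
}

def share_within_one_sd(data):
    """Fraction of values lying within one standard deviation of the mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(abs(x - mean) <= sd for x in data) / len(data)

for name, data in shapes.items():
    mean = statistics.fmean(data)
    median = statistics.median(data)
    print(f"{name:13s} mean={mean:6.3f} median={median:6.3f} "
          f"within 1 SD={share_within_one_sd(data):.3f}")
```

For the normal data roughly two thirds of values should land within one standard deviation, and for the skewed data the long right tail should pull the mean above the median.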

Sampling Distributions

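  • Describes the probability distribution of a sample statistic (e.g., the average height of a sample of players) across repeated samples
  • Larger sample sizes lead to less variance in sample means: the standard deviation of the sample mean (the standard error) is σ/√n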
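
The effect of sample size can be checked by simulation. The sketch below repeatedly draws samples of player heights, records each sample's mean, and compares the spread of those means with the standard theoretical value σ/√n. The population mean and standard deviation are assumed values chosen for illustration, not real data.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population of player heights in cm (assumed values, not real data).
POP_MEAN, POP_SD = 200.0, 9.0

def sample_mean(n):
    """Mean height of one random sample of n players."""
    return statistics.fmean([random.gauss(POP_MEAN, POP_SD) for _ in range(n)])

def spread_of_sample_means(n, repeats=4000):
    """Standard deviation of many sample means, each from a sample of size n."""
    return statistics.stdev([sample_mean(n) for _ in range(repeats)])

for n in (5, 20, 80):
    observed = spread_of_sample_means(n)
    expected = POP_SD / n ** 0.5   # theoretical standard error, sigma / sqrt(n)
    print(f"n={n:3d}  observed={observed:.2f}  sigma/sqrt(n)={expected:.2f}")
```

Quadrupling the sample size should roughly halve the spread of the sample means.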

Estimation

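  • Sample Statistic: A value computed from a sample (e.g., Steph Curry's 3-point percentage over the shots observed so far), used as an estimate
  • Parameter Estimation: Use the sample statistic to estimate the true underlying value (e.g., his long-run 3-point ability), with a confidence interval to express the uncertainty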
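
One common way to build such an interval for a proportion is the normal-approximation (Wald) interval, sketched below. This is one possible method and not necessarily the one used in the talk; the shot counts and the function name `proportion_ci` are hypothetical.

```python
import math

def proportion_ci(successes, trials, z=1.96):
    """Approximate 95% CI for a proportion (normal approximation, z = 1.96)."""
    p_hat = successes / trials
    std_err = math.sqrt(p_hat * (1 - p_hat) / trials)
    return p_hat - z * std_err, p_hat + z * std_err

# Hypothetical shot counts for illustration only (not real player statistics).
made, attempts = 430, 1000
low, high = proportion_ci(made, attempts)

print(f"sample estimate: {made / attempts:.3f}")
print(f"approx. 95% CI:  ({low:.3f}, {high:.3f})")
```

The approximation works best with many attempts and a proportion not too close to 0 or 1; more attempts give a narrower interval.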

Parameters and Statistics

  • Parameters describe the population; statistics are computed from a sample and estimate them. Common Greek letters used for parameters:
    • Mu (μ): Mean
    • Sigma (σ): Standard deviation
    • Pi (π): Proportion
    • Rho (ρ): Correlation
    • Beta (β): Slope (gradient) of a regression line
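
Each of these parameters has a sample counterpart that can be computed directly from data. The sketch below does so for a small made-up data set; the variable names (`x_bar`, `s`, `p_hat`, `r`, `b`) follow common textbook notation and the numbers are purely illustrative.

```python
import statistics

# Hypothetical sample: minutes played and points scored in six games
# (made-up numbers for illustration only).
minutes = [28, 31, 35, 25, 38, 33]
points = [18, 22, 27, 15, 31, 25]

x_bar = statistics.fmean(minutes)                    # estimates mu
s = statistics.stdev(minutes)                        # estimates sigma
p_hat = sum(p >= 20 for p in points) / len(points)   # estimates pi (share of 20+ point games)

# Sums of squares and cross-products around the sample means.
y_bar = statistics.fmean(points)
sxx = sum((x - x_bar) ** 2 for x in minutes)
syy = sum((y - y_bar) ** 2 for y in points)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(minutes, points))

r = sxy / (sxx * syy) ** 0.5   # estimates rho
b = sxy / sxx                  # estimates beta (slope of points on minutes)

print(f"x_bar={x_bar:.2f} s={s:.2f} p_hat={p_hat:.2f} r={r:.3f} b={b:.3f}")
```

Every value printed here is a statistic: it would change with a different sample, whereas the parameter it estimates is fixed.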

Hypothesis Testing

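  • Null Hypothesis (H₀): Default assumption (e.g., the player's true shooting percentage is ≤ 50%)
  • Alternative Hypothesis (H₁): The claim we seek evidence for (e.g., true shooting percentage > 50%)
  • Rejection Region: Range of sample values too extreme to be plausible if H₀ were true; a sample falling here leads to rejecting H₀
  • P-Value: Probability, assuming H₀ is true, of observing a sample at least as extreme as the one obtained; if < 0.05, H₀ is conventionally rejected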
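
To make the shooting example concrete, the sketch below runs a one-sided exact binomial test of H₀: p ≤ 0.5 against H₁: p > 0.5, evaluating the p-value at the boundary p = 0.5. The sample of 60 makes in 100 shots and the function name `binomial_p_value` are hypothetical, and the exact binomial test is one reasonable choice rather than the method shown in the talk.

```python
from math import comb

def binomial_p_value(made, attempts, p0=0.5):
    """One-sided p-value: P(X >= made) when X ~ Binomial(attempts, p0)."""
    return sum(
        comb(attempts, k) * p0 ** k * (1 - p0) ** (attempts - k)
        for k in range(made, attempts + 1)
    )

# Hypothetical sample for illustration only: 60 makes in 100 shots.
made, attempts = 60, 100
alpha = 0.05   # conventional significance level

p_value = binomial_p_value(made, attempts)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-value = {p_value:.4f} -> {decision}")
```

A result of 55 makes in 100 would give a p-value well above 0.05, so the same procedure would fail to reject H₀ there.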

Important Notes on Hypothesis Testing

  • Never "prove" or "accept" a hypothesis; the only conclusions are to reject H₀ or to fail to reject it
  • Failing to reject H₀ means there is insufficient evidence against it, not that H₀ is true

P-values and P-hacking

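  • P-value: Indicates how extreme the sample statistic is under H₀; the smaller the p-value, the stronger the evidence against H₀
  • P-hacking: Misuse of p-values by testing many hypotheses and reporting only the significant results; at the 0.05 level, about 1 in 20 true null hypotheses will appear significant by chance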
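
A short simulation shows why this is a problem. In the sketch below every null hypothesis is true, yet a study that tries 20 independent tests and reports only the best one "finds" something most of the time. The number of tests per study is an assumed value, and each p-value is drawn uniformly on [0, 1], which is how p-values behave under a true null with a continuous test statistic.

```python
import random

random.seed(2)  # fixed seed so the run is repeatable

ALPHA = 0.05
TESTS_PER_STUDY = 20   # hypothetical number of hypotheses tried per study

def study_finds_something():
    """Simulate one study in which every null hypothesis is actually true.

    Each p-value is drawn uniformly on [0, 1], standing in for a real test
    run on data where there is no effect.
    """
    p_values = [random.random() for _ in range(TESTS_PER_STUDY)]
    return min(p_values) < ALPHA   # report only the "best" result

studies = 10_000
hits = sum(study_finds_something() for _ in range(studies))
theory = 1 - (1 - ALPHA) ** TESTS_PER_STUDY

print(f"simulated share with a 'significant' result: {hits / studies:.3f}")
print(f"theoretical value 1 - 0.95**20:              {theory:.3f}")
```

Reporting all tests performed, or adjusting the threshold for the number of tests, avoids presenting these chance findings as real effects.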

Conclusion

  • Statistics does not prove claims; it supports inference under uncertainty
  • Statistical tests need to be understood and applied properly to avoid misleading conclusions

For more detailed content and mathematical explanations, visit zstatistics.com.