Introduction to Statistics - Key Concepts
Presenter: Justin Zeltzer from zstatistics.com
Overview
- Challenge: Explain statistics in under 30 minutes
- Objective: Develop intuition around statistics without complex math
- Themed examples: NBA (National Basketball Association)
Types of Data
- Categorical Data
  - Nominal: No intrinsic order (e.g., NBA team names)
  - Ordinal: Ordered categories (e.g., player positions: Guard, Forward, Center)
- Numerical Data
  - Discrete: Countable values (e.g., missed free throws)
  - Continuous: Measurable values (e.g., player height)
Proportions and Percentages
- Proportions are numerical summaries of nominal data (e.g., Steph Curry’s 3-point percentage)
- Discussion: Discrete vs. Continuous nature of proportions
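A minimal sketch of the point above, using a made-up shot log (not real Steph Curry data). Each shot is a nominal outcome (made or missed); the proportion summarizes them as a number. The names `shots` and `possible_values` are illustrative only.

```python
from fractions import Fraction

# Hypothetical shot log: 1 = made three-pointer, 0 = miss (not real data).
shots = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]

made = sum(shots)
attempts = len(shots)
proportion = made / attempts

# With a fixed number of attempts the proportion can only equal k / attempts,
# so it is strictly discrete; the gaps shrink as attempts grow.
possible_values = [Fraction(k, attempts) for k in range(attempts + 1)]

print(f"{made}/{attempts} made = {proportion:.1%}")
print(f"{len(possible_values)} possible values with {attempts} attempts")
```

With thousands of attempts the possible values sit so close together that the proportion is usually handled as if it were continuous.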
Distributions
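- Probability Density Function (PDF): Describes how probability is spread across the values of a continuous variable; the area under the curve over a range is the probability of landing in that range
- Normal Distribution (Bell Curve): Common in statistics; symmetric, with the bulk of the data clustered around the mean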
- Other Distributions:
  - Uniform (flat distribution)
  - Bimodal (two peaks)
  - Skewed (left/right tails)
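To make the bell curve concrete, here is a small sketch using Python's standard-library `statistics.NormalDist`. The mean of 200 cm and standard deviation of 10 cm are assumed values for player height, chosen only for illustration.

```python
from statistics import NormalDist

# Assumed (illustrative) model of player height in cm; not real NBA figures.
heights = NormalDist(mu=200, sigma=10)

# The density is highest at the mean and falls away in both tails.
density_at_mean = heights.pdf(200)
density_in_tail = heights.pdf(220)

# Probability is the area under the PDF: here, within one standard
# deviation of the mean (190 cm to 210 cm).
within_one_sd = heights.cdf(210) - heights.cdf(190)

print(f"density at mean: {density_at_mean:.4f}, at 220 cm: {density_in_tail:.4f}")
print(f"P(190 <= height <= 210) = {within_one_sd:.3f}")
```

About 68% of the area falls within one standard deviation of the mean, which is what "bulk of data around the mean" refers to.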
Sampling Distributions
- Describes the probability distribution of a sample statistic (e.g., the average height in a sample) across repeated samples
- Larger sample sizes lead to less variance in sample means
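The effect of sample size can be checked with a quick simulation. This is a sketch under assumed numbers: a simulated population of 10,000 heights (mean 200 cm, standard deviation 10 cm), with samples of 5 and 50 players drawn repeatedly. The helper `sample_mean_spread` is a name made up for this example.

```python
import random
import statistics

random.seed(0)

# Simulated population of heights in cm (assumed values, not real data).
population = [random.gauss(200, 10) for _ in range(10_000)]

def sample_mean_spread(n, repeats=2000):
    """Standard deviation of the sample mean over many samples of size n."""
    means = [statistics.fmean(random.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

spread_small = sample_mean_spread(5)
spread_large = sample_mean_spread(50)

print(f"spread of sample means, n=5:  {spread_small:.2f} cm")
print(f"spread of sample means, n=50: {spread_large:.2f} cm")
```

The spread for n=50 comes out clearly smaller, in line with the standard result that the standard deviation of a sample mean is roughly σ/√n.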
Estimation
- Sample Statistic: Example - Steph Curry's 3-point percentage (sample estimate)
- Parameter Estimation: Use the sample statistic to estimate the true (unknown) population value, with a confidence interval expressing the uncertainty
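As a rough illustration of estimating a true proportion, the sketch below builds a 95% confidence interval from a sample proportion. It assumes the normal-approximation (Wald) interval, which is one common choice and not necessarily the method used in the talk. The counts (430 made out of 1,000 attempts) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical sample: 430 three-pointers made out of 1000 attempts.
made, attempts = 430, 1000
p_hat = made / attempts  # sample statistic

# Normal-approximation (Wald) interval, an assumption of this sketch.
z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
standard_error = sqrt(p_hat * (1 - p_hat) / attempts)
lower = p_hat - z * standard_error
upper = p_hat + z * standard_error

print(f"sample proportion: {p_hat:.3f}")
print(f"95% CI for the true proportion: ({lower:.3f}, {upper:.3f})")
```

The interval is a range of plausible values for the parameter; it narrows as the number of attempts grows.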
Parameters and Statistics
- Common Greek letters used for parameters:
  - Mu (μ): Mean
  - Sigma (σ): Standard deviation
  - Pi (π): Proportion
  - Rho (ρ): Correlation
  - Beta (β): Gradient in regression
Hypothesis Testing
- Null Hypothesis (H₀): Default assumption (e.g., ≤ 50% shooting)
- Alternate Hypothesis (H₁): What we seek to prove (e.g., > 50% shooting)
- Rejection Region: Range of sample results too extreme to be plausible if H₀ is true
- P-Value: Probability of a sample at least as extreme as the one observed, assuming H₀ is true; if < 0.05, this often leads to rejecting H₀
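The test above can be sketched with an exact binomial calculation. The sample (60 made out of 100 shots) is hypothetical, and `one_sided_p_value` is a helper written for this example, not a library function. The p-value is computed at the boundary of H₀ (a true rate of exactly 50%).

```python
from math import comb

def one_sided_p_value(made, attempts, p0=0.5):
    """P(X >= made) when X ~ Binomial(attempts, p0), the boundary of H0."""
    return sum(
        comb(attempts, k) * p0**k * (1 - p0) ** (attempts - k)
        for k in range(made, attempts + 1)
    )

# Hypothetical sample: 60 made shots out of 100 attempts.
made, attempts = 60, 100
p_value = one_sided_p_value(made, attempts)

alpha = 0.05  # conventional significance level
reject_h0 = p_value < alpha

print(f"p-value = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Here the p-value falls below 0.05, so this sample lands in the rejection region: it would be unusually extreme if the true shooting rate were 50% or less.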
Important Notes on Hypothesis Testing
- Never "prove" or "accept" a hypothesis; we either reject H₀ or fail to reject it
- Failing to reject H₀ means only that the sample gave insufficient evidence against it, not that H₀ is true
P-values and P-hacking
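- P-value: Indicates how extreme the sample statistic is under H₀; the smaller the value, the more extreme the sample
- P-hacking: Running many hypothesis tests and reporting only the significant results; with a 0.05 cutoff, roughly 1 in 20 tests of a true H₀ will look significant by chance alone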
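A simulation shows why reporting only significant results is misleading. In this sketch every simulated player truly shoots 50%, so H₀ holds in every test, yet some tests still come out "significant". The setup (1,000 tests of 100 shots each) and the helper `one_sided_p_value` are assumptions made for illustration.

```python
import random
from math import comb

def one_sided_p_value(made, attempts, p0=0.5):
    """P(X >= made) when X ~ Binomial(attempts, p0)."""
    return sum(
        comb(attempts, k) * p0**k * (1 - p0) ** (attempts - k)
        for k in range(made, attempts + 1)
    )

random.seed(1)
n_tests, attempts = 1000, 100
significant = 0

for _ in range(n_tests):
    # Every simulated player truly shoots 50%, so H0 is true each time.
    made = sum(random.random() < 0.5 for _ in range(attempts))
    if one_sided_p_value(made, attempts) < 0.05:
        significant += 1

print(f"{significant} of {n_tests} tests were 'significant' although H0 was true")
```

Reporting only those few results, and hiding the rest, would present pure chance as evidence.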
Conclusion
- Statistics does not prove but aids in inference
- Importance of understanding and applying statistical tests properly
For more detailed content and mathematical explanations, visit zstatistics.com.