Introduction to Statistics
Jul 23, 2024
Lecture: Introduction to Statistics by Justin Zeltzer
Overview
Goal: Explain statistics in under half an hour without math
Audience: Beginners in statistics; those interested in basic concepts
Theme: Examples from NBA basketball
Types of Data
Categorical Data
Nominal: No order
Example: Teams Steph Curry can play for
Ordinal: Ordered categories
Example: Positions in basketball (Guard, Forward, Center)
Numerical Data
Discrete: Countable values, typically whole-number counts
Example: Number of free throws missed
Continuous: Any value within a range; can be subdivided indefinitely
Example: Player’s height (e.g., 191.3 cm)
Proportions
Proportions as Numerical Data:
Example: Steph Curry’s three-point percentage
Percentages are proportions out of 100
Built from nominal data (made/missed shots)
Discussion on whether proportions are discrete or continuous
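A minimal sketch of how a proportion is built from made/missed outcomes. The shot log is hypothetical (not real data), and the variable names are chosen only for illustration. It also lists the values the proportion can take for a fixed number of attempts, which is one way to look at the discrete-versus-continuous question.

```python
# Hypothetical shot log (not real data): True = made, False = missed.
shots = [True, False, True, True, False, False, True, False, True, False]

made = sum(shots)               # count of the "made" category
attempts = len(shots)
proportion = made / attempts    # a number between 0 and 1
percentage = 100 * proportion   # the same quantity expressed out of 100

# With a fixed number of attempts, only attempts + 1 distinct values are possible.
possible_values = [k / attempts for k in range(attempts + 1)]

print(f"{made}/{attempts} = {proportion:.2f} = {percentage:.0f}%")
print(f"{len(possible_values)} possible proportions for {attempts} attempts")
```

For a fixed number of attempts the proportion is technically discrete, but as attempts grow the possible values become so finely spaced that treating it as continuous is usually reasonable.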
Distributions
Probability Density Function (PDF)
Example: Height distribution of NBA players (Isaiah Thomas 5’9”, Boban Marjanovic 7’3”)
Normal Distribution (Bell Curve):
Most players around average height
Other Types:
Uniform Distribution:
Equal probability across all values
Bimodal Distribution:
Two peaks
Skewed Distribution:
Long tail in one direction (left or right skew)
Sampling Distributions
Concerned with the distribution of sample means rather than individual measurements
Law of Large Numbers:
Larger samples give sample means that sit closer to the population mean; the variance of the sample mean shrinks as sample size grows
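The effect of sample size on the sampling distribution can be sketched with a small simulation. The population here is simulated (normally distributed heights with an assumed mean of 200 cm and standard deviation of 9 cm), not real NBA data, and `sample_means` is a hypothetical helper.

```python
import random
import statistics

def sample_means(population, n, repeats, rng):
    """Draw `repeats` samples of size n (with replacement) and return each sample's mean."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(repeats)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated population of heights in cm (assumed values, for illustration only).
population = [rng.gauss(200, 9) for _ in range(10_000)]

spread = {}
for n in (5, 50, 500):
    means = sample_means(population, n, repeats=1_000, rng=rng)
    spread[n] = statistics.stdev(means)  # spread of the sampling distribution
    print(f"n={n:>3}: standard deviation of sample means = {spread[n]:.2f} cm")
```

The spread of the sample means falls as n increases, even though the spread of individual heights in the population does not change.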
Estimation and Inference
Sample Statistic vs Parameter:
Sample provides an estimate, not the precise population value
Example: Steph Curry’s three-point shooting percentage (Theta, Θ)
Confidence Intervals:
Provide a range where the true parameter likely falls
Individual player’s confidence intervals (e.g., Meyers Leonard vs Steph Curry)
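As a rough illustration of why two players with the same observed percentage can have very different confidence intervals, the sketch below uses the normal-approximation (Wald) interval for a proportion. The lecture notes do not specify a formula, so this choice is an assumption, and the shot counts are hypothetical.

```python
import math

def proportion_ci(made, attempts, z=1.96):
    """Approximate 95% interval for a proportion (normal approximation, z = 1.96)."""
    p = made / attempts
    margin = z * math.sqrt(p * (1 - p) / attempts)
    # Clamp to [0, 1], since a proportion cannot fall outside that range.
    return max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical counts: both players shoot 40%, but on very different volumes.
few_attempts = proportion_ci(made=20, attempts=50)
many_attempts = proportion_ci(made=400, attempts=1000)

for label, (low, high) in (("50 attempts", few_attempts), ("1000 attempts", many_attempts)):
    print(f"{label}: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

The player with fewer attempts gets a much wider interval, reflecting greater uncertainty about the true shooting percentage.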
Parameters and Statistics
Parameters (Greek Letters):
Mu (μ):
Mean
Sigma (σ):
Standard deviation
Pi (π):
Proportion
Rho (ρ):
Correlation
Beta (β):
Gradient in regression
Sample Statistics (Roman letters):
X-bar (x̄):
Sample mean
s:
Sample standard deviation
p:
Sample proportion
r:
Sample correlation
b:
Sample gradient
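To make the statistic-versus-parameter distinction concrete, here is a small sketch that computes x̄ and s from a sample. The five heights are hypothetical; the population values μ and σ would remain unknown and are only estimated by these numbers.

```python
import statistics

# Hypothetical sample of five player heights in cm (not real data).
heights = [185, 191, 198, 203, 208]

x_bar = statistics.mean(heights)   # sample mean, an estimate of mu
s = statistics.stdev(heights)      # sample standard deviation (n - 1 divisor), an estimate of sigma

print(f"x-bar = {x_bar} cm, s = {s:.2f} cm")
```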
Hypothesis Testing
Null Hypothesis (H₀):
Assumes no effect or status quo
Alternative Hypothesis (H₁):
What we seek evidence for (e.g., player shooting above 50%)
Example:
Meyers Leonard’s three-point shooting
Rejection Region:
Range of sample statistic values extreme enough to reject H₀
Significance Level (α):
Often set at 0.05 (5%)
Key Points on Hypothesis Testing
Never say “prove” or “accept” in conclusions
Conclusions are phrased as either rejecting or not rejecting H₀, depending on the strength of the evidence
Analogy with judicial system: not enough evidence ≠ innocence
p-Values
Definition:
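Probability, assuming H₀ is true, of observing a sample statistic at least as extreme as the one obtained
Interpretation:
A smaller p-value means a more extreme sample and stronger evidence against H₀
Reject H₀ when the p-value is below the significance level (α), which is equivalent to the sample statistic falling in the rejection region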
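A sketch of a one-sided p-value for the "shooting above 50%" style of alternative hypothesis, computed exactly from the binomial distribution. The shot counts are hypothetical, and `binomial_p_value` is an illustrative helper rather than anything defined in the lecture.

```python
from math import comb

def binomial_p_value(made, attempts, p0):
    """One-sided p-value: P(X >= made) when X ~ Binomial(attempts, p0)."""
    return sum(
        comb(attempts, k) * p0**k * (1 - p0) ** (attempts - k)
        for k in range(made, attempts + 1)
    )

alpha = 0.05  # significance level

# H0: true shooting proportion is 0.5; H1: it is above 0.5.
# Hypothetical samples of 10 shots each.
p_strong = binomial_p_value(made=9, attempts=10, p0=0.5)
p_weak = binomial_p_value(made=6, attempts=10, p0=0.5)

for made, p_value in ((9, p_strong), (6, p_weak)):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"{made}/10 made: p-value = {p_value:.4f} -> {decision}")
```

Making 9 of 10 is extreme enough under H₀ to fall below α, while making 6 of 10 is not, so only the first sample leads to rejecting H₀.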
Ethical Considerations in Research
P-Hacking
Definition:
Manipulating data collection or analysis to achieve desired p-values
Good Research:
Form a hypothesis first, then collect data and test only the targeted effect
Bad Research:
Collect data broadly, test many effects, report significant ones
Problem:
Increases likelihood of finding false positives
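The size of the false-positive problem can be sketched with a short calculation. It assumes every test is independent and that no real effect exists in any of them (both simplifying assumptions), and `chance_of_false_positive` is a hypothetical helper name.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of `num_tests` independent tests is
    significant at level alpha when every null hypothesis is actually true."""
    return 1 - (1 - alpha) ** num_tests

for num_tests in (1, 5, 20):
    chance = chance_of_false_positive(num_tests)
    print(f"{num_tests:>2} tests: {chance:.1%} chance of at least one false positive")
```

Under these assumptions, testing 20 effects and reporting only the significant ones gives roughly a 64% chance of reporting something that is not real.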
Additional Resources
Further learning available at zstatistics.com
Conclusion
Summary of key concepts with NBA examples
Aim to provide intuition and understanding of statistical principles