AP Statistics Comprehensive Notes
Unit 1: Introduction to Data
Types of Data
- Categorical Data: Names and labels (e.g., eye color, hair color).
- Quantitative Data: Numerical values (e.g., height, class size).
Representing Categorical Data
- Use Two-Way Tables to display relationships between categorical variables.
- Marginal Relative Frequency: Percentage of data in a single row/column compared to the total.
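- Joint Relative Frequency: Percentage of data in a single cell (one combination of row and column categories) compared to the total.
- Conditional Relative Frequency: Percentage of data in a single category, computed within one specific group (a cell count divided by its row or column total).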
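The three relative frequencies can be illustrated with a small two-way table. This is a minimal sketch; the counts and the category names (grade level and preferred pet) are made up for illustration.

```python
# Hypothetical counts: rows are grade level, columns are preferred pet.
table = {
    "Junior": {"Dog": 30, "Cat": 20},
    "Senior": {"Dog": 25, "Cat": 25},
}

# Grand total of all cells.
total = sum(sum(row.values()) for row in table.values())

# Marginal: one row (or column) total divided by the grand total.
marginal_junior = sum(table["Junior"].values()) / total

# Joint: one cell divided by the grand total.
joint_junior_dog = table["Junior"]["Dog"] / total

# Conditional: one cell divided by its row (or column) total.
# Here: proportion who prefer dogs, given that the student is a junior.
cond_dog_given_junior = table["Junior"]["Dog"] / sum(table["Junior"].values())

print(marginal_junior, joint_junior_dog, cond_dog_given_junior)
```

Note that the joint and conditional values use the same cell count; only the denominator changes.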
Describing Quantitative Data
- Use the acronym CSOCCS:
- C: Context
- S: Shape
- O: Outliers
- C: Center
- C: Spread
- S: Summarize
Basic Statistical Terms
- Mean: Average value.
- Standard Deviation: Typical distance of the data values from the mean.
- Median: 50th percentile value.
- Range: Difference between maximum and minimum values.
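The four terms above can be computed with Python's standard `statistics` module. A minimal sketch, assuming a made-up list of heights; `stdev` is the sample standard deviation (it divides by n - 1).

```python
import statistics

heights = [60, 62, 65, 65, 68, 70]  # hypothetical heights in inches

mean = statistics.mean(heights)            # sum of values divided by the count
median = statistics.median(heights)        # middle value (average of the two middle values here)
sd = statistics.stdev(heights)             # sample standard deviation (divides by n - 1)
data_range = max(heights) - min(heights)   # maximum minus minimum

print(mean, median, round(sd, 3), data_range)
```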
Box Plots and Outliers
- Five Number Summary: Minimum, Q1, Median, Q3, Maximum.
- IQR (Interquartile Range): Q3 - Q1.
- Outliers: Low-end values fall below Q1 - 1.5 × IQR; high-end values fall above Q3 + 1.5 × IQR.
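The five number summary and the 1.5 × IQR rule can be sketched as follows. The helper `five_number_summary` is a hypothetical name, and it assumes quartiles are found as the medians of the lower and upper halves of the sorted data (excluding the overall median when the count is odd); other software may use slightly different quartile conventions.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]             # values below the median position
    upper = data[half + (n % 2):]   # values above the median position
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

scores = [2, 4, 5, 6, 7, 8, 9, 10, 30]  # hypothetical data
minimum, q1, med, q3, maximum = five_number_summary(scores)

iqr = q3 - q1
low_fence = q1 - 1.5 * iqr    # values below this are low-end outliers
high_fence = q3 + 1.5 * iqr   # values above this are high-end outliers
outliers = [x for x in scores if x < low_fence or x > high_fence]

print((minimum, q1, med, q3, maximum), iqr, outliers)
```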
Transformations and Distributions
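- Effects of transformations on shape, center, and variability: Adding or subtracting a constant shifts the center but leaves shape and variability unchanged; multiplying by a positive constant rescales both center and variability but leaves the shape unchanged.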
- Density Curves and the Normal Distribution.
- 68-95-99.7 Rule: In a normal distribution, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean.
- Utilize calculators for normal distribution problems.
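As a rough stand-in for the calculator work, the sketch below checks how adding and multiplying change the mean and standard deviation, then uses `statistics.NormalDist` to reproduce the 68-95-99.7 percentages. The data values and the N(65, 3) height model are assumptions made up for illustration.

```python
import statistics
from statistics import NormalDist

data = [2, 4, 4, 6, 9]              # hypothetical values
shifted = [x + 10 for x in data]    # add a constant to every value
scaled = [x * 3 for x in data]      # multiply every value by a constant

# Adding shifts the mean only; multiplying rescales both mean and SD.
for label, values in [("original", data), ("plus 10", shifted), ("times 3", scaled)]:
    print(label, statistics.mean(values), round(statistics.stdev(values), 3))

z = NormalDist()  # standard normal: mean 0, SD 1
within_1 = z.cdf(1) - z.cdf(-1)
within_2 = z.cdf(2) - z.cdf(-2)
within_3 = z.cdf(3) - z.cdf(-3)
print(round(within_1, 4), round(within_2, 4), round(within_3, 4))

# Hypothetical model: heights are normal with mean 65 and SD 3.
heights = NormalDist(mu=65, sigma=3)
p_below_68 = heights.cdf(68)          # area to the left of 68
height_90th = heights.inv_cdf(0.90)   # value at the 90th percentile
```

`cdf` plays the role of a calculator's normal cumulative probability function, and `inv_cdf` plays the role of its inverse normal function.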
Unit 2: Describing Relationships
Scatterplots and Correlation
- Use the acronym CSDFS: Context, Strength, Direction, Form, and State Outliers.
- Correlation Coefficient (r): Ranges from -1 to 1 and describes the strength and direction of a linear relationship.
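The correlation coefficient can be computed directly from its definition. A minimal sketch; the function name `correlation` and the hours/scores data are hypothetical.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]         # hypothetical hours studied
scores = [52, 60, 61, 70, 77]   # hypothetical test scores

r = correlation(hours, scores)
print(round(r, 3))  # close to 1: strong, positive linear relationship
```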
Regression Lines
- LSRL (Least Squares Regression Line): Interpret the slope and the y-intercept in context.
- Residual: Observed value minus predicted value (residual = y - ŷ).
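A small sketch of fitting the LSRL and computing residuals by hand. The helper name `lsrl` and the data are hypothetical; the slope and intercept come from the standard least squares formulas.

```python
def lsrl(xs, ys):
    """Return (intercept, slope) of the least squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar   # the line passes through (x_bar, y_bar)
    return intercept, slope

hours = [1, 2, 3, 4, 5]         # hypothetical hours studied
scores = [52, 60, 61, 70, 77]   # hypothetical test scores

intercept, slope = lsrl(hours, scores)
predicted = [intercept + slope * x for x in hours]
residuals = [y - y_hat for y, y_hat in zip(scores, predicted)]  # observed - predicted

print(intercept, slope, residuals)
```

In context, the slope here would be read as the predicted change in score for each additional hour studied, and the intercept as the predicted score at zero hours.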
Outliers and Influences
- Effect of outliers on r and on the regression line.
- Residual Plots: A residual plot with no clear pattern suggests that a linear model is appropriate; a curved pattern suggests it is not.
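Both ideas in this section can be checked numerically. The sketch below uses a hypothetical helper `fit_line` and made-up data: first it shows how a single unusual point changes r and the slope, then it lists the residuals from fitting a line to curved data.

```python
def fit_line(xs, ys):
    """Return (intercept, slope, r) for the least squares line."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope, sxy / (sxx * syy) ** 0.5

# 1) Influence of one unusual point on r and the slope.
xs, ys = [1, 2, 3, 4, 5], [2, 4, 6, 8, 10]
_, slope_clean, r_clean = fit_line(xs, ys)
_, slope_outlier, r_outlier = fit_line(xs + [10], ys + [0])  # add the point (10, 0)
print(round(r_clean, 3), round(r_outlier, 3))

# 2) Residuals from fitting a line to curved data (y = x squared).
curve_x = [1, 2, 3, 4, 5]
curve_y = [x ** 2 for x in curve_x]
a, b, _ = fit_line(curve_x, curve_y)
residuals = [y - (a + b * x) for x, y in zip(curve_x, curve_y)]
print(residuals)  # positive, then negative, then positive: a curved pattern
```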
Unit 3: Collecting Data
Sampling Methods
- SRS (Simple Random Sample), Stratified Sampling, Cluster Sampling, and Systematic Sampling.
- Sources of Bias: Convenience sampling and voluntary response sampling.
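The four random sampling methods can be sketched with Python's `random` module. The population of 100 ID numbers, the two strata, and the clusters of 10 are all assumptions made up for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical student ID numbers 1 to 100

# SRS: every possible group of 10 is equally likely to be chosen.
srs = random.sample(population, 10)

# Stratified: split into strata (two hypothetical grade groups), then take an SRS from each.
strata = {"juniors": population[:50], "seniors": population[50:]}
stratified = [s for group in strata.values() for s in random.sample(group, 5)]

# Cluster: split into clusters (hypothetical homerooms of 10), then take every member
# of a few randomly chosen clusters.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
cluster_sample = [s for chosen in random.sample(clusters, 2) for s in chosen]

# Systematic: pick a random starting point, then take every k-th individual.
k = 10
start = random.randrange(k)
systematic = population[start::k]

print(srs, stratified, cluster_sample, systematic, sep="\n")
```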
Observational Studies vs Experiments
- Principles of Experimental Design: Comparison, random assignment, control, and replication.
- Blinding and Blocking.
Experimental Design
- Matched Pairs and Randomized Block Design.
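Random assignment under the two designs can be sketched as below. The subject names, the blocking variable (experience level), and the pairings are hypothetical.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical subjects, each tagged with a blocking variable (experience level).
subjects = [("Ana", "novice"), ("Ben", "novice"), ("Cy", "novice"), ("Di", "novice"),
            ("Eli", "expert"), ("Fay", "expert"), ("Gus", "expert"), ("Hal", "expert")]

# Randomized block design: group similar subjects, then randomize within each block.
blocks = {}
for name, level in subjects:
    blocks.setdefault(level, []).append(name)

block_assignment = {}
for level, members in blocks.items():
    shuffled = random.sample(members, len(members))  # random order within the block
    half = len(shuffled) // 2
    for name in shuffled[:half]:
        block_assignment[name] = "treatment"
    for name in shuffled[half:]:
        block_assignment[name] = "control"

# Matched pairs: each pair acts as a block of size two; chance decides who is treated.
pairs = [("Ana", "Ben"), ("Cy", "Di")]
pair_assignment = {}
for first, second in pairs:
    treated, untreated = random.sample([first, second], 2)
    pair_assignment[treated] = "treatment"
    pair_assignment[untreated] = "control"

print(block_assignment)
print(pair_assignment)
```

Randomizing inside each block keeps the blocking variable balanced across the treatment and control groups.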
Unit 4: Probability and Random Variables
Probability Concepts
- Definitions: Mutually Exclusive and Independence.
- Probability rules and calculations.
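The two definitions are easy to confuse, so here is a minimal sketch using one roll of a fair six-sided die. The events and the helper name `prob` are chosen for illustration.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # one roll of a fair six-sided die

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

even = {2, 4, 6}
odd = {1, 3, 5}
low = {1, 2}

# Mutually exclusive: the events share no outcomes.
even_odd_exclusive = len(even & odd) == 0

# Independent: P(A and B) equals P(A) * P(B).
even_low_independent = prob(even & low) == prob(even) * prob(low)
even_odd_independent = prob(even & odd) == prob(even) * prob(odd)

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_even_or_low = prob(even) + prob(low) - prob(even & low)

# Conditional probability: P(A | B) = P(A and B) / P(B).
p_even_given_low = prob(even & low) / prob(low)

print(even_odd_exclusive, even_low_independent, even_odd_independent)
print(p_even_or_low, p_even_given_low)
```

Here "even" and "odd" are mutually exclusive but not independent, while "even" and "low" are independent but not mutually exclusive.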
Random Variables
- Discrete vs Continuous Random Variables.
- Calculation of the expected value (mean) and standard deviation of a random variable.
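For a discrete random variable, the expected value and standard deviation come from weighting each value by its probability. A minimal sketch, assuming X is the number of heads in two fair coin flips.

```python
from math import sqrt

# Hypothetical distribution: X = number of heads in two fair coin flips.
dist = {0: 0.25, 1: 0.5, 2: 0.25}   # value -> probability

# Expected value: sum of (value * probability).
mean = sum(x * p for x, p in dist.items())

# Variance: sum of (squared distance from the mean * probability).
variance = sum((x - mean) ** 2 * p for x, p in dist.items())
sd = sqrt(variance)

print(mean, variance, round(sd, 4))
```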
Binomial and Geometric Distributions
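- BINS (conditions for a binomial setting): Binary outcomes, Independent trials, fixed Number of trials, Same probability of success on each trial. A geometric setting has no fixed number of trials; it counts trials until the first success.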
- Calculations and using calculators for probability distributions.
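The binomial and geometric probabilities can be computed from their formulas, which gives values comparable to a calculator's binomial and geometric probability functions. The function names and the example of 10 fair coin flips are assumptions for illustration.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(exactly k successes in n trials)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_cdf(k, n, p):
    """P(at most k successes in n trials)."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

def geometric_pmf(k, p):
    """P(first success occurs on trial k)."""
    return (1 - p) ** (k - 1) * p

n, p = 10, 0.5  # hypothetical: 10 flips of a fair coin, success = heads

p_exactly_5 = binomial_pmf(5, n, p)
p_at_most_2 = binomial_cdf(2, n, p)
p_first_head_on_3rd = geometric_pmf(3, p)

binomial_mean = n * p                  # mean of a binomial random variable
binomial_sd = sqrt(n * p * (1 - p))    # standard deviation of a binomial random variable
geometric_mean = 1 / p                 # expected number of trials until the first success

print(p_exactly_5, p_at_most_2, p_first_head_on_3rd)
```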
Unit 5: Sampling Distributions
Sampling Distributions
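- Statistics vs Parameters: A statistic describes a sample; a parameter describes a population.
- Unbiased Estimators: A statistic is unbiased when the mean of its sampling distribution equals the parameter it estimates.
- Central Limit Theorem: For a sufficiently large sample size (commonly n ≥ 30), the sampling distribution of the sample mean is approximately normal regardless of the population's shape.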
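A simulation can make these ideas concrete. The sketch below assumes a right-skewed population (exponential with mean 1 and SD 1) and draws 2000 samples of each size; the helper names and the skewness measure (average cubed z-score) are choices made for illustration.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n draws from a right-skewed population (exponential, mean 1, SD 1)."""
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

def skewness(values):
    """Rough skewness: average cubed z-score (near 0 for a symmetric distribution)."""
    m, s = statistics.mean(values), statistics.pstdev(values)
    return statistics.mean([((v - m) / s) ** 3 for v in values])

means_n2 = [sample_mean(2) for _ in range(2000)]
means_n30 = [sample_mean(30) for _ in range(2000)]

for label, means, n in [("n = 2", means_n2, 2), ("n = 30", means_n30, 30)]:
    print(label,
          "center:", round(statistics.mean(means), 3),
          "spread:", round(statistics.stdev(means), 3),
          "theory spread:", round(1 / n ** 0.5, 3),
          "skewness:", round(skewness(means), 3))
```

The center of both sampling distributions should sit near the population mean (the sample mean is unbiased), the spread should shrink roughly like 1 / sqrt(n), and the skewness should move toward 0 as n grows.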
Unit 6: Inference for Categorical Data: Proportions
Confidence Intervals and Hypothesis Testing
- PANIC procedure for confidence intervals and PHANTOMS procedure for hypothesis tests.
- Interpretation of results in context.
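The calculation steps of a one-sample z interval and z test for a proportion can be sketched as follows. The survey counts, the 95% confidence level, and the hypotheses (H0: p = 0.5 against Ha: p > 0.5) are assumptions made up for illustration; the context and conclusion steps still have to be written in words.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical data: 120 of 200 randomly sampled students support a proposal.
successes, n = 120, 200
p_hat = successes / n

# Large counts condition: at least 10 successes and 10 failures.
large_counts_ok = successes >= 10 and (n - successes) >= 10

# 95% confidence interval: p_hat +/- z* x standard error (uses p_hat).
z_star = NormalDist().inv_cdf(0.975)
se = sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - z_star * se, p_hat + z_star * se)

# Test of H0: p = 0.5 against Ha: p > 0.5 (standard deviation uses p0, not p_hat).
p0 = 0.5
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)   # right-tail area

print(ci, round(z, 3), round(p_value, 4))
```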
Unit 7: Inference for Quantitative Data: Means
Procedures and Conditions
- Use of the t-distribution and calculating degrees of freedom.
- Similar steps for confidence intervals and hypothesis testing as in Unit 6, but with means.
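A sketch of the one-sample t calculations, assuming made-up data with n = 10 and a null value of 12. Python's standard library has no t-distribution, so the critical value t* = 2.262 (95% confidence, df = 9) is typed in from a t table, and the P-value would still come from a table or calculator.

```python
from math import sqrt
import statistics

sample = [12, 15, 11, 14, 13, 16, 12, 14, 15, 13]  # hypothetical measurements

n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)   # sample standard deviation
df = n - 1                     # degrees of freedom for one-sample t procedures
se = s / sqrt(n)               # standard error of the sample mean

# 95% confidence interval; t* is a table value that is only valid for df = 9.
t_star = 2.262
ci = (x_bar - t_star * se, x_bar + t_star * se)

# Test statistic for H0: mu = 12 (hypothetical null value).
mu0 = 12
t = (x_bar - mu0) / se

print(df, round(se, 3), ci, round(t, 3))
```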
Unit 8: Inference for Categorical Data: Chi-Squared Tests
Types of Chi-Squared Tests
- Goodness of Fit: Comparing observed to expected distributions.
- Homogeneity and Independence Tests: Exploring relationships between categorical variables.
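The chi-squared statistic is the same calculation in every test; only the expected counts and degrees of freedom differ. A minimal sketch with made-up counts: the P-value step is left to a table or calculator because the standard library has no chi-squared distribution.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Goodness of fit: observed counts against a claimed distribution (hypothetical).
observed = [30, 20, 25, 25]
claimed = [0.25, 0.25, 0.25, 0.25]
n = sum(observed)
expected = [n * p for p in claimed]        # expected count = n * claimed proportion
gof_stat = chi_square(observed, expected)
gof_df = len(observed) - 1                 # categories - 1

# Homogeneity or independence: expected count = row total * column total / grand total.
table = [[30, 20],
         [25, 25]]                         # hypothetical two-way table of counts
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand = sum(row_totals)
expected_table = [[r * c / grand for c in col_totals] for r in row_totals]
table_stat = sum(chi_square(obs_row, exp_row)
                 for obs_row, exp_row in zip(table, expected_table))
table_df = (len(table) - 1) * (len(table[0]) - 1)   # (rows - 1) * (columns - 1)

print(gof_stat, gof_df, round(table_stat, 4), table_df)
```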
Unit 9: Inference for Slope
Linear Regression Inference
- Testing linear relationships and using linear regression models.
- Using t-distributions for slope inference.
- Detailed procedures for confidence intervals and hypothesis testing for slopes.
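The slope calculations can be sketched by hand on a small made-up data set. The critical value t* = 3.182 (95% confidence, df = 3) is typed in from a t table because the standard library has no t-distribution; variable names are hypothetical.

```python
from math import sqrt

hours = [1, 2, 3, 4, 5]         # hypothetical explanatory values
scores = [52, 60, 61, 70, 77]   # hypothetical response values

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(scores) / n
sxx = sum((x - x_bar) ** 2 for x in hours)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores)) / sxx
intercept = y_bar - slope * x_bar

# Standard deviation of the residuals, using n - 2 degrees of freedom.
residuals = [y - (intercept + slope * x) for x, y in zip(hours, scores)]
df = n - 2
s = sqrt(sum(e ** 2 for e in residuals) / df)

# Standard error of the slope.
se_slope = s / sqrt(sxx)

# Test statistic for H0: true slope = 0.
t = (slope - 0) / se_slope

# 95% confidence interval for the slope; t* is a table value that is only valid for df = 3.
t_star = 3.182
ci = (slope - t_star * se_slope, slope + t_star * se_slope)

print(round(se_slope, 4), round(t, 3), ci)
```

An interval that excludes 0 is consistent with rejecting H0: true slope = 0 at the matching significance level.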
These notes serve as a comprehensive guide through AP Statistics, covering data collection, probability, inferential statistics, and regression analysis.