
Comprehensive AP Statistics Lecture Notes

May 8, 2025

AP Statistics Unit 1-9 Lecture Notes

Unit 1: Introduction to Statistics

Data Types

Quantitative Data:
- Deals with numbers (e.g., heights, class size, population).
- Represented numerically.
Categorical Data:
- Deals with names and labels (e.g., eye color, hair color).
- Values are category labels, not measured quantities; summarized with counts or percentages.
- Often displayed in two-way tables.

Two-Way Tables

Used to represent categorical data.
Shows intersections between variables.
Marginal Relative Frequency: A row or column total divided by the grand total.
Joint Relative Frequency: A single cell count divided by the grand total.
Conditional Relative Frequency: A single cell count divided by its row or column total (the percentage in one category given a specific group).
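
As a rough illustration of the three relative frequencies, here is a minimal Python sketch. The table, its labels, and its counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical two-way table: rows are grade levels, columns are preferred sport.
table = {
    "Junior": {"Soccer": 20, "Tennis": 30},
    "Senior": {"Soccer": 10, "Tennis": 40},
}

grand_total = sum(sum(row.values()) for row in table.values())  # 100
junior_total = sum(table["Junior"].values())                    # 50

# Marginal: one row (or column) total divided by the grand total.
marginal_junior = junior_total / grand_total

# Joint: one cell divided by the grand total.
joint_junior_soccer = table["Junior"]["Soccer"] / grand_total

# Conditional: one cell divided by its row (or column) total.
soccer_given_junior = table["Junior"]["Soccer"] / junior_total

print(marginal_junior, joint_junior_soccer, soccer_given_junior)  # 0.5 0.2 0.4
```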

Quantitative Data

Described using the acronym "C-SOCS":
- Context
- Shape: Symmetrical, skewed, number of peaks.
- Outliers: Identify any extreme data points.
- Center: Mean or median.
- Spread: Range, standard deviation, IQR.

Basic Statistical Terms

Mean: Average of all values.
Standard Deviation: Typical distance of values from the mean.
Median: 50th percentile.
Range: Difference between max and min values.

Box Plots

Use the five-number summary: Minimum, Q1, Median, Q3, Maximum.
IQR (Interquartile Range): Q3 - Q1.
Identifies low-end and high-end outliers with the 1.5 × IQR rule: values below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR).

Percentiles and Frequency

Percentile: Percentage of values below a certain point.
Cumulative Relative Frequency: Running total of the relative frequencies through each interval, expressed as a percentage.

Z-scores

Measures the number of standard deviations a value is from the mean: z = (value - mean) / standard deviation.
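
A z-score is a one-line calculation. The sketch below uses a hypothetical test score of 85 from a distribution with mean 70 and standard deviation 10.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

print(z_score(85, 70, 10))  # 1.5, i.e. 1.5 standard deviations above the mean
print(z_score(60, 70, 10))  # -1.0, i.e. 1 standard deviation below the mean
```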

Data Transformation

Addition/Subtraction: Shape and variability remain the same; measures of center shift by the constant.
Multiplication/Division: Shape remains the same; center and variability are multiplied or divided by the constant.
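
A quick numerical check of these transformation rules, using a small hypothetical data set and Python's standard library:

```python
from statistics import mean, stdev

data = [2, 4, 6, 8]                 # hypothetical data
shifted = [x + 10 for x in data]    # add a constant
scaled = [x * 3 for x in data]      # multiply by a constant

# Adding 10 moves the center by 10 but leaves the spread alone.
print(mean(data), mean(shifted))    # center goes from 5 to 15
print(stdev(data), stdev(shifted))  # same standard deviation

# Multiplying by 3 multiplies both the center and the spread by 3.
print(mean(scaled))                 # 15
print(stdev(scaled) / stdev(data))  # ratio is 3 (up to rounding)
```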

Density Curves and Normal Distribution

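Density Curve: Models a distribution with a curve that lies on or above the horizontal axis and has a total area of exactly 1 underneath it; areas under the curve represent proportions.
Normal Distribution: Symmetric and bell-shaped; follows the 68-95-99.7 rule (about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean).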

Calculator Commands for Normal Distribution

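normPDF: Finds the height of the normal density curve at a specific value (not a probability; for a continuous variable, the probability of any single exact value is 0).
normCDF: Finds the probability that a value falls within an interval.
Inverse Normal: Finds the value at a given percentile (area to the left).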
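
For readers working without a graphing calculator, Python's standard-library `statistics.NormalDist` offers rough analogues of these three commands. The sketch below assumes a hypothetical normal distribution of heights with mean 170 and standard deviation 10.

```python
from statistics import NormalDist

heights = NormalDist(mu=170, sigma=10)  # hypothetical distribution

# Analogue of normPDF: height of the density curve at a value (not a probability).
density_at_mean = heights.pdf(170)

# Analogue of normCDF: area under the curve between two bounds.
p_between = heights.cdf(180) - heights.cdf(160)

# Analogue of Inverse Normal: value with a given area to its left.
cutoff_90th = heights.inv_cdf(0.90)

print(round(p_between, 4))    # 0.6827, matching the "68" in the 68-95-99.7 rule
print(round(cutoff_90th, 1))  # 182.8
```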

Normal Probability Plot

Plots actual data values against their theoretical z-scores to check normality; a roughly linear pattern suggests the data are approximately normal.

Unit 2: Correlation and Regression

Scatter Plots and Correlation

Acronym SEED:
- State in context.
- Examine direction (positive/negative).
- Outliers noted.
- Form (linear/nonlinear).
- Strength of correlation.

R Value (Correlation Coefficient)

Ranges from -1 to 1.
Values closer to -1 or 1 indicate a stronger linear relationship; values near 0 indicate a weak linear relationship.

Regression Lines

Best Fit Line (y-hat): Predicts y-values from x using the slope and y-intercept: y-hat = a + bx.
Residuals: Actual value minus predicted value (y - y-hat).

Least Squares Regression Line

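Minimizes the sum of squared residuals.
S Value: Standard deviation of the residuals; the typical distance between the actual values and the values predicted by the LSRL.
R-squared Value: Coefficient of determination; the proportion of the variation in y that is explained by the linear relationship with x.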
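
The quantities above can be computed from scratch. The following is a minimal sketch using the standard least-squares formulas; the function name and the five data points are hypothetical.

```python
from math import sqrt

def least_squares(xs, ys):
    """Return slope, intercept, r, r-squared, and s for the LSRL of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)

    # Residual = actual - predicted.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # s = standard deviation of the residuals, with n - 2 degrees of freedom.
    s = sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return slope, intercept, r, r ** 2, s

xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 4, 5, 4, 5]
slope, intercept, r, r_squared, s = least_squares(xs, ys)
print(round(slope, 3), round(intercept, 3))            # 0.6 2.2
print(round(r, 3), round(r_squared, 3), round(s, 3))   # 0.775 0.6 0.894
```

Here r-squared of 0.6 would be read as: about 60% of the variation in y is explained by the linear relationship with x.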

Impact of Outliers

Outliers can substantially change the slope, y-intercept, and correlation of a regression line, especially points with extreme x-values.

Residual Plots

Used to determine whether a linear model is appropriate.
Random scatter around zero supports a linear model; a clear pattern implies a linear model may not be the best fit.

Unit 3: Sampling Methods

Common Sampling Methods

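Simple Random Sample (SRS): Every individual, and every possible group of the chosen size, has an equal chance of being selected.
Stratified Random Sample: Divides the population into groups (strata) by shared traits, then takes a random sample from every stratum.
Cluster Sample: Divides the population into clusters, randomly selects some clusters, and samples everyone in them.
Systematic Random Sample: Selects individuals at regular intervals after a random start.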
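
The four methods can be mimicked with Python's `random` module. In this sketch the roster of 100 students, the two grade-level strata, and the ten homeroom clusters are all hypothetical; the seed is fixed only so the run is reproducible.

```python
import random

random.seed(1)  # fixed only for reproducibility

population = [f"student_{i}" for i in range(1, 101)]  # hypothetical roster

# Simple random sample: every group of 10 is equally likely.
srs = random.sample(population, 10)

# Stratified: split by a shared trait, then take an SRS from EVERY stratum.
strata = {"grade_11": population[:50], "grade_12": population[50:]}
stratified = [p for group in strata.values() for p in random.sample(group, 5)]

# Cluster: split into homerooms of 10, randomly pick 2 homerooms, keep everyone in them.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
cluster_sample = [p for chosen in random.sample(clusters, 2) for p in chosen]

# Systematic: random starting point, then every 10th individual.
start = random.randrange(10)
systematic = population[start::10]

print(len(srs), len(stratified), len(cluster_sample), len(systematic))  # 10 10 20 10
```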

Poor Sampling Methods

Convenience Sample: Chooses easy-to-reach individuals.
Voluntary Response Sampling: Participants choose to join.

Shortcomings and Biases

Undercoverage: Leaving out groups.
Non-response Bias: Selected individuals don't respond.
Response Bias: False or misleading answers.
Wording Bias: Poorly phrased questions.

Observational Studies vs. Experiments

Observational Study: Observes subjects without imposing any treatment.
Experiment: Deliberately imposes treatments on subjects to measure their effects on a response.

Principles of Experimental Design

Comparison, Random Assignment, Control, Replication.

Terminology

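Factors: Explanatory variables in an experiment.
Levels: The specific values or categories of a factor.
Confounding: Occurs when the effects of two variables on the response cannot be separated.
Placebo: Fake treatment.
Blinding: Single-blind (either the subjects or the evaluators don't know which treatment was given) and double-blind (neither group knows).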

Experimental Designs

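Randomized Block Design: Groups similar subjects into blocks, then randomly assigns treatments within each block.
Matched Pairs Design: Pairs subjects with similar characteristics and randomly assigns one treatment to each member of the pair (or gives each subject both treatments in random order).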

Unit 4: Probability and Random Variables

Probability Basics

Probability ranges from 0 to 1.
Chance behavior is unpredictable in the short term but follows a predictable pattern in the long term.

Simulation

Uses models to mimic real-world events.
Four-step process: Define problem, model, perform, estimate probability.
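
The four steps can be sketched in Python. The question used here (the chance of rolling at least one six in four rolls of a fair die) is a hypothetical example, picked because the exact answer is known and can be compared with the estimate.

```python
import random

# Step 1 (define the problem): P(at least one six in 4 rolls of a fair die).
# Step 2 (model): each roll is a random integer from 1 to 6.
def estimate_at_least_one_six(num_rolls=4, repetitions=20_000, seed=0):
    rng = random.Random(seed)  # seeded only for reproducibility
    hits = 0
    # Step 3 (perform): repeat the chance process many times.
    for _ in range(repetitions):
        rolls = [rng.randint(1, 6) for _ in range(num_rolls)]
        if 6 in rolls:
            hits += 1
    # Step 4 (estimate): proportion of repetitions in which the event occurred.
    return hits / repetitions

estimate = estimate_at_least_one_six()
exact = 1 - (5 / 6) ** 4  # complement of "no sixes in 4 rolls", about 0.518
print(estimate, round(exact, 3))
```

With many repetitions the estimate settles near the exact value, which is the long-run regularity that probability describes.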

Probability Rules

Mutually Exclusive: Events can't occur together, so P(A and B) = 0.
Independence: The outcome of one event doesn't affect the probability of the other.
Addition Rule: P(A or B) = P(A) + P(B) - P(A and B); for mutually exclusive events this reduces to P(A) + P(B).
Multiplication Rule: P(A and B) = P(A) × P(B given A); for independent events this reduces to P(A) × P(B).
Complement Rule: Probability of not occurring is 1 minus probability of occurring.
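
These rules can be checked with exact fractions. The sketch below assumes a standard 52-card deck as a hypothetical example; it uses the general addition rule P(A or B) = P(A) + P(B) - P(A and B) and the general multiplication rule P(A and B) = P(A) × P(B given A).

```python
from fractions import Fraction

# One card drawn from a standard 52-card deck.
p_heart = Fraction(13, 52)
p_king = Fraction(4, 52)
p_heart_and_king = Fraction(1, 52)  # the king of hearts

# General addition rule (events overlap, so subtract the overlap once).
p_heart_or_king = p_heart + p_king - p_heart_and_king

# Mutually exclusive events: a card cannot be both a heart and a spade.
p_heart_or_spade = Fraction(13, 52) + Fraction(13, 52)

# General multiplication rule: two kings in a row without replacement.
p_two_kings = Fraction(4, 52) * Fraction(3, 51)

# Independence check: does P(A and B) equal P(A) * P(B)?
heart_and_king_independent = p_heart_and_king == p_heart * p_king

# Complement rule.
p_not_heart = 1 - p_heart

print(p_heart_or_king, p_heart_or_spade, p_two_kings)  # 4/13 1/2 1/221
print(heart_and_king_independent, p_not_heart)         # True 3/4
```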

Visualization of Probability

Venn diagrams, two-way tables, probability trees.

Random Variables

Discrete: Countable values.
Continuous: Any value within a range.
Binomial: Counts successes in a fixed number of trials.
Geometric: Counts trials until the first success.

Mean and Standard Deviation of Random Variables

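Enter the values and their probabilities into the calculator and run one-variable statistics.
For a discrete random variable: mean = sum of (value × probability); standard deviation = square root of the sum of ((value - mean)^2 × probability).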
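
A small worked calculation for a discrete random variable, using the standard formulas for its mean (sum of value × probability) and variance (sum of squared deviation × probability). The probability distribution below is hypothetical.

```python
from math import sqrt

# Hypothetical distribution: value -> probability (probabilities sum to 1).
distribution = {0: 0.2, 1: 0.5, 2: 0.3}

# Mean (expected value): weight each value by its probability.
mu = sum(x * p for x, p in distribution.items())

# Variance: weight each squared deviation from the mean by its probability.
variance = sum((x - mu) ** 2 * p for x, p in distribution.items())
sigma = sqrt(variance)

print(round(mu, 2), round(variance, 2), round(sigma, 2))  # 1.1 0.49 0.7
```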

Transforming Probability Distributions

Addition/Subtraction: Shape unchanged, center changes.
Multiplication/Division: Shape unchanged, center and variability change.

Binomial Random Variables

Acronym BINS: Binary, Independent, Number of trials, Set probability.

Calculator Commands for Binomial Random Variables

binomialPDF: Probability of exactly x successes.
binomialCDF: Cumulative probability of x or fewer successes.
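
What these commands compute can be sketched with the binomial formula. The function names and argument order below are hypothetical stand-ins, not the calculator's own syntax.

```python
from math import comb

def binomial_pdf(n, p, x):
    """P(exactly x successes in n independent trials, success probability p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(n, p, x):
    """P(x or fewer successes): add up the exact probabilities for 0 through x."""
    return sum(binomial_pdf(n, p, k) for k in range(x + 1))

# Hypothetical example: 10 fair coin flips.
print(binomial_pdf(10, 0.5, 3))      # 0.1171875  (exactly 3 heads)
print(binomial_cdf(10, 0.5, 3))      # 0.171875   (3 or fewer heads)
print(1 - binomial_cdf(10, 0.5, 3))  # 0.828125   (at least 4 heads, by the complement rule)
```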

Geometric Random Variables

Models number of trials until first success.
Calculator commands: geometricPDF, geometricCDF.
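
The geometric calculations follow from the same idea: the first success on trial x requires x - 1 failures followed by one success. The function names below are hypothetical stand-ins for the calculator commands.

```python
def geometric_pdf(p, x):
    """P(first success occurs on exactly trial x)."""
    return (1 - p) ** (x - 1) * p

def geometric_cdf(p, x):
    """P(first success occurs on or before trial x) = 1 - P(x failures in a row)."""
    return 1 - (1 - p) ** x

# Hypothetical example: success probability 0.2 on each trial.
print(round(geometric_pdf(0.2, 3), 3))  # 0.128
print(round(geometric_cdf(0.2, 3), 3))  # 0.488
```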

Unit 5: Sampling Distributions

Statistic vs. Parameter

Statistic: From a sample.
Parameter: From a population.

Sampling Distribution

The distribution of a statistic across all possible samples of the same size from a population.
Unbiased estimator: The mean of the statistic's sampling distribution equals the parameter.

Proportions

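Built by taking repeated samples and plotting the sample proportions.
Conditions: Random sampling, 10% condition (sample size is at most 10% of the population), large counts (np and n(1 - p) are both at least 10).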

Calculating Z-scores

Used to find probabilities of sample proportions: z = (p-hat - p) / sqrt(p(1 - p) / n).
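
A sketch of this calculation in Python, assuming a hypothetical population proportion of 0.60, a sample of 100, and an observed sample proportion of 0.65. It uses the standard formula for the standard deviation of a sample proportion, sqrt(p(1 - p) / n), and the usual large counts check.

```python
from math import sqrt
from statistics import NormalDist

p = 0.60      # hypothetical population proportion
n = 100       # hypothetical sample size
p_hat = 0.65  # hypothetical observed sample proportion

# Large counts condition: expected successes and failures are both at least 10.
large_counts_ok = n * p >= 10 and n * (1 - p) >= 10

sd_p_hat = sqrt(p * (1 - p) / n)  # standard deviation of the sampling distribution
z = (p_hat - p) / sd_p_hat

# Probability of getting a sample proportion of 0.65 or higher.
p_above = 1 - NormalDist().cdf(z)

print(large_counts_ok, round(z, 2), round(p_above, 3))
```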

Central Limit Theorem

The sampling distribution of the sample mean is approximately normal when the sample size is large (n ≥ 30), regardless of the shape of the population.
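
A simulation can make this concrete. The sketch below assumes a strongly right-skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here purely for illustration), draws many samples of size 30, and summarizes the resulting sample means.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)  # seeded only for reproducibility
sample_size = 30
num_samples = 5000

# Each entry is the mean of one sample of 30 values from the skewed population.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(num_samples)
]

center = mean(sample_means)       # should be close to the population mean, 1
spread = stdev(sample_means)      # should be close to sigma / sqrt(n)
expected_spread = 1 / sqrt(sample_size)

print(round(center, 3), round(spread, 3), round(expected_spread, 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is far from normal.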

Unit 6: Inference for Proportions

Point Estimate

Statistic estimating population parameter.

Confidence Interval vs. Confidence Level

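Interval: Range of plausible values for the parameter.
Level: Long-run percentage of intervals, built by the same method from repeated samples, that capture the parameter. It is not the probability that the parameter falls within one particular interval.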

Confidence Interval Interpretation

We are X% confident the interval from Y to Z captures the true parameter in context.

Factors Affecting Confidence Interval

Higher confidence level = wider interval.
Larger sample size = narrower interval.
Margin of error accounts only for sampling variability; it does not account for bias.

PANIC for Confidence Interval

Parameter, Assumptions, Name of interval, Interval, Conclusion in context.

Confidence Interval Calculation

Use calculator or formula.
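
As one possible version of the formula route, here is a minimal sketch of a one-proportion z-interval: p-hat plus or minus z* × sqrt(p-hat(1 - p-hat) / n). The function name and the sample counts are hypothetical, and the conditions (random sample, 10% condition, large counts) are assumed to have been checked already.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_interval(successes, n, confidence=0.95):
    """Return (low, high) for a one-proportion z-interval."""
    p_hat = successes / n
    # Critical value z*: leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin_of_error, p_hat + margin_of_error

# Hypothetical sample: 120 successes out of 200.
low, high = one_proportion_z_interval(120, 200)
print(round(low, 3), round(high, 3))  # 0.532 0.668

# A higher confidence level gives a wider interval.
low99, high99 = one_proportion_z_interval(120, 200, confidence=0.99)
print(high99 - low99 > high - low)    # True
```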

Significance Tests

PHANTOMS:
- Parameter, Hypothesis, Assumptions, Name of test, Test statistic, Obtain p-value, Make decision, State conclusion.
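
The test statistic and p-value steps can be sketched for a two-sided one-proportion z-test. The function name, the sample counts, and the null value of 0.5 are hypothetical; note that the standard error uses the null value p0, not the sample proportion.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-sided test of H0: p = p0. Returns (z statistic, p-value)."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # computed under the null hypothesis
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample: 120 successes out of 200, testing H0: p = 0.5.
z, p_value = one_proportion_z_test(120, 200, 0.5)
print(round(z, 2), round(p_value, 4))

alpha = 0.05
print("reject H0" if p_value < alpha else "fail to reject H0")  # reject H0
```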

Type I & II Errors

Type I: False positive; null hypothesis is true but rejected.
Type II: False negative; null hypothesis is false but not rejected.
Power: Probability of correctly rejecting a false null hypothesis.

Unit 7: Inference for Means

Key Differences

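Use t instead of z for means because the population standard deviation is usually unknown and is estimated by the sample standard deviation.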
Degrees of freedom: n - 1.
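
A minimal sketch of the one-sample t statistic, t = (sample mean - hypothesized mean) / (s / sqrt(n)), with n - 1 degrees of freedom. The function name and data are hypothetical. Python's standard library has no t-distribution, so the p-value would still come from a t-table or calculator.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t_statistic(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: mu = mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # uses the sample standard deviation
    t = (mean(sample) - mu0) / standard_error
    return t, n - 1

sample = [4, 6, 8, 10, 12]  # hypothetical data
t, df = one_sample_t_statistic(sample, mu0=6)
print(round(t, 3), df)  # 1.414 4
```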

PANIC Procedure

Same as proportions but with means.

Interpretation

Similar to proportions but refers to means.

Significance Tests

Similar to proportions but for means.

Unit 8: Chi-Squared Tests

Chi-Squared Tests Overview

Tests for goodness of fit, homogeneity, and association/independence.

Goodness of Fit

Compares observed distribution to expected.
No parameter/sample statistic.
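
The comparison of observed and expected counts is summarized by the chi-squared statistic, the sum of (observed - expected)^2 / expected. The sketch below uses hypothetical counts for four categories that are claimed to be equally likely, with the usual k - 1 degrees of freedom for a goodness-of-fit test; the p-value would come from a chi-squared table or calculator.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 22, 20, 40]                # hypothetical counts, total 100
claimed_proportions = [0.25, 0.25, 0.25, 0.25]
total = sum(observed)
expected = [total * p for p in claimed_proportions]  # 25 in each category

chi_sq = chi_square_statistic(observed, expected)
df = len(observed) - 1

print(round(chi_sq, 2), df)  # 12.32 3
```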

Homogeneity

Compares distributions across multiple groups for a single variable.

Association/Independence

Tests for association between two variables in one sample.

Unit 9: Inference for Slope

Confidence Intervals and Significance Tests

For true population slope.
Check assumptions for linearity, independence, equal variance, normality.

Degrees of Freedom

n - 2 for slope inference.
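
A rough sketch of the test statistic for the slope, t = b / SE_b, using the standard formula SE_b = s / sqrt(sum of (x - x-bar)^2), where s is the standard deviation of the residuals. The function name and data are hypothetical, and the p-value would still come from a t-table or calculator with n - 2 degrees of freedom.

```python
from math import sqrt

def slope_t_statistic(xs, ys):
    """Return (slope, SE of slope, t statistic, df) for H0: true slope = 0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # s: standard deviation of the residuals, with n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))

    se_slope = s / sqrt(sxx)
    return slope, se_slope, slope / se_slope, n - 2

xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 4, 5, 4, 5]
b, se_b, t_stat, df_slope = slope_t_statistic(xs, ys)
print(round(b, 3), round(se_b, 3), round(t_stat, 3), df_slope)  # 0.6 0.283 2.121 3
```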

Conclusion

Always in context.

These notes provide a comprehensive review of the major concepts across various units in AP Statistics, serving as a helpful study guide for the course.
