AP Stats Unit 1 Summary

Overview

This lecture summarizes AP Statistics Unit 1, Exploring One-Variable Data: categorical and quantitative variables, data displays, summary statistics, methods for comparing distributions, and the normal distribution.

Types of Data and Variables

Data can come from a sample or from an entire population; a number that summarizes a sample is a statistic, and a number that summarizes a population is a parameter.
Individuals are the subjects from which data is collected; they can be people, objects, or other entities.
Variables are characteristics that differ among individuals.
Variables are categorized as categorical (group labels, words) or quantitative (measured/countable numbers).
Discrete quantitative variables have countable values; continuous quantitative variables can take any value in an interval.

Categorical Data Representation

Categorical data is summarized using frequency tables (counts) and relative frequency tables (proportions).
Main graphical displays: bar graphs (frequencies or proportions) and pie charts (proportions).
Describing distributions for categorical data focuses on most/least common categories and comparing groups.
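
As a minimal sketch of the frequency and relative frequency tables described above, the following Python snippet counts a small made-up set of responses. The data and the variable names are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical sample: favorite color reported by 10 students.
responses = ["blue", "red", "blue", "green", "blue",
             "red", "green", "blue", "red", "blue"]

# Frequency table: count of each category.
freq = Counter(responses)

# Relative frequency table: each count divided by the total number of responses.
n = len(responses)
rel_freq = {category: count / n for category, count in freq.items()}

# Print categories from most to least common.
for category, count in freq.most_common():
    print(f"{category:<6} count={count}  proportion={rel_freq[category]:.2f}")
```

The relative frequencies always sum to 1, which is why they are the natural input for a pie chart or a bar graph of proportions.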

Quantitative Data Representation

Quantitative data is displayed with dot plots, stem-and-leaf plots, histograms, and cumulative frequency graphs.
Histograms group values into bins (intervals of equal width); bar heights show the frequency or, in a relative frequency histogram, the proportion in each bin.
Cumulative frequency graphs plot, for each value, the cumulative proportion of the data at or below that value.
When describing distributions: mention shape (symmetric, skewed, unimodal, uniform), center, spread, and outliers.
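
The numbers behind a histogram and a cumulative frequency graph can be sketched as follows. The scores and the bin width of 10 are made up for illustration, and each bin is assumed to include its left endpoint but not its right.

```python
# Hypothetical quiz scores (out of 50).
scores = [12, 18, 21, 24, 25, 27, 29, 31, 33, 34, 36, 38, 41, 44, 47]

bin_width = 10
bin_starts = range(10, 50, bin_width)   # bins: [10,20), [20,30), [30,40), [40,50)

n = len(scores)
cumulative = 0
rows = []
for start in bin_starts:
    end = start + bin_width
    count = sum(start <= x < end for x in scores)   # frequency in this bin
    cumulative += count                             # running total up to this bin
    rows.append((start, end, count, count / n, cumulative / n))

for start, end, count, rel, cum in rows:
    print(f"[{start}, {end})  freq={count}  rel={rel:.2f}  cum={cum:.2f}")
```

The frequency column gives the bar heights of a histogram, the relative frequency column gives the heights of a relative frequency histogram, and the cumulative column gives the points of a cumulative frequency graph.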

Measures of Center, Position, and Spread

Mean (average) is affected by outliers; median is the middle value and resistant to outliers.
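Median location: position (n + 1) / 2 in the ordered data; when n is even, the median is the average of the two middle values.
Mean ≈ median when the distribution is roughly symmetric; the mean is typically less than the median when the distribution is skewed left and greater than the median when it is skewed right.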
Percentiles show relative position (e.g., median is the 50th percentile, quartiles are 25th/75th percentiles).
Range = max − min; IQR = Q3 − Q1, which measures the spread of the middle 50% of the data.
Standard deviation measures the typical distance of the values from the mean; like the mean, it is affected by outliers.
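
The measures above can be computed with Python's standard `statistics` module, as in this sketch. The data set is made up, and the `quartiles` helper is a hypothetical function that assumes Q1 and Q3 are found as the medians of the lower and upper halves of the ordered data; calculators and software may use slightly different quartile rules.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    The middle value is left out of both halves when n is odd.
    Assumes at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]
    return statistics.median(lower), statistics.median(upper)

# Hypothetical data set; 40 is an unusually large value.
data = [2, 4, 4, 5, 7, 8, 9, 40]

mean = statistics.mean(data)
median = statistics.median(data)
data_range = max(data) - min(data)
q1, q3 = quartiles(data)
iqr = q3 - q1
s = statistics.stdev(data)   # sample standard deviation (divides by n - 1)

print(f"mean={mean}  median={median}")
print(f"range={data_range}  Q1={q1}  Q3={q3}  IQR={iqr}")
print(f"sample standard deviation={s:.2f}")
```

In this example the single large value pulls the mean (9.875) well above the median (6), while the median and IQR barely depend on it.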

Outliers and Data Transformation

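Outliers: by the "fence method," a value is an outlier if it falls below Q1 − 1.5×IQR or above Q3 + 1.5×IQR; alternatively, flag values more than 2 standard deviations from the mean.
Adding or subtracting a constant shifts measures of center and position (mean, median, quartiles) by that constant but leaves measures of spread (range, IQR, standard deviation) unchanged.
Multiplying all data values by a constant multiplies measures of center and position by that constant and measures of spread by its absolute value.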
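
The effect of shifting and rescaling can be checked numerically. The sketch below uses made-up values, a shift of 5, and a multiplier of 3; `summary` is a hypothetical helper written for this example.

```python
import statistics

def summary(values):
    """Return (mean, median, sample standard deviation, range)."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.stdev(values),
            max(values) - min(values))

data = [10, 12, 15, 18, 20, 25]        # hypothetical values
shifted = [x + 5 for x in data]        # add a constant to every value
scaled = [3 * x for x in data]         # multiply every value by a constant

for label, values in (("original", data), ("x + 5", shifted), ("3x", scaled)):
    mean, median, sd, rng = summary(values)
    print(f"{label:<9} mean={mean:.2f}  median={median:.2f}  sd={sd:.2f}  range={rng}")
```

In the "x + 5" row the mean and median go up by 5 while the standard deviation and range stay the same; in the "3x" row all four measures are tripled.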

Boxplots and Comparing Distributions

Five-number summary: min, Q1, median, Q3, max; used to make (modified) boxplots.
Boxplots visualize spread, center, outliers, and shape (e.g., skewness).
When comparing two distributions: compare shape, center, spread, and outliers, and use context-specific language.
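
The numbers needed to draw a modified boxplot can be sketched as follows: the five-number summary, the 1.5×IQR fences, the outliers, and the whisker ends. The data set is made up, and `five_number_summary` is a hypothetical helper that assumes quartiles are the medians of the lower and upper halves of the ordered data.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); assumes at least two values."""
    data = sorted(values)
    half = len(data) // 2
    q1 = statistics.median(data[:half])               # median of the lower half
    q3 = statistics.median(data[len(data) - half:])   # median of the upper half
    return data[0], q1, statistics.median(data), q3, data[-1]

# Hypothetical data set with one unusually large value.
data = [15, 22, 24, 25, 27, 28, 30, 31, 33, 35, 60]

minimum, q1, median, q3, maximum = five_number_summary(data)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values beyond the fences are plotted as separate points in a modified boxplot.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Whiskers extend to the most extreme values that are not outliers.
inside = [x for x in data if lower_fence <= x <= upper_fence]
whiskers = (min(inside), max(inside))

print("five-number summary:", (minimum, q1, median, q3, maximum))
print("fences:", (lower_fence, upper_fence))
print("outliers:", outliers, " whiskers:", whiskers)
```

Here the fences are 10.5 and 46.5, so 60 is the only outlier and the upper whisker stops at 35 rather than at the maximum.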

Normal Distribution and Z-scores

Some quantitative data fit a normal (bell-shaped, symmetric) distribution described by mean (μ) and standard deviation (σ).
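Empirical Rule: in a normal distribution, about 68% of the data lie within 1σ of the mean, about 95% within 2σ, and about 99.7% within 3σ.
Z-score formula: z = (value − mean) / standard deviation; it measures how many standard deviations a value lies above (positive z) or below (negative z) the mean.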
Use Z-scores and technology/tables to find proportions below, above, or between values in a normal distribution.
Find percentiles or cutoff values by working backwards from a given proportion using Z-scores.
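
As one way to carry out these calculations with technology, the sketch below uses `NormalDist` from Python's standard `statistics` module. The mean of 100, the standard deviation of 15, and the chosen values are assumptions for illustration.

```python
from statistics import NormalDist

mu, sigma = 100, 15            # hypothetical mean and standard deviation
dist = NormalDist(mu, sigma)

# Z-score: how many standard deviations 130 lies above the mean.
x = 130
z = (x - mu) / sigma

# Proportions below, above, and between values.
below = dist.cdf(x)                       # proportion below 130
above = 1 - below                         # proportion above 130
between = dist.cdf(115) - dist.cdf(85)    # proportion within 1 sigma of the mean

# Working backwards: the cutoff value for the 90th percentile.
cutoff_90 = dist.inv_cdf(0.90)
z_90 = (cutoff_90 - mu) / sigma           # z-score of that cutoff

print(f"z = {z}")
print(f"below = {below:.4f}  above = {above:.4f}  between = {between:.4f}")
print(f"90th percentile cutoff = {cutoff_90:.2f} (z = {z_90:.2f})")
```

The "between" proportion comes out close to 0.68, matching the Empirical Rule for values within one standard deviation of the mean.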

Key Terms & Definitions

Statistic — summary measure from sample data.
Parameter — summary measure from a full population.
Categorical variable — data in categories or group labels.
Quantitative variable — data as numbers measured or counted.
Discrete variable — quantitative, countable values.
Continuous variable — quantitative, infinite possible values in interval.
Frequency table — counts of occurrences in each category or interval.
Relative frequency — proportion or percent in a category or interval.
Histogram — bar graph for quantitative data using intervals (bins).
Dot plot/Stem-and-leaf plot — shows individual values of quantitative data.
Mean (x̄) — arithmetic average.
Median — middle value in ordered data.
Percentile — percentage of data at or below a value.
Quartiles (Q1, Q3) — divide data into four equal parts.
Interquartile range (IQR) — Q3 minus Q1.
Standard deviation (σ or s) — typical distance from the mean.
Boxplot — graph of five-number summary and outliers.
Normal distribution — symmetric, bell-shaped curve for quantitative data.
Z-score — standardized value: (value − mean) / standard deviation.

Action Items / Next Steps

Review and complete the Unit 1 study guide.
Practice describing and comparing distributions using graphs and summary statistics.
Practice normal distribution calculations with Z-scores on a calculator or in Desmos.
Prepare for Unit 1 test and continue studying with the provided review materials.