Unit 6 Summer Review: Inference for Categorical Data
Introduction
- Topic: Inference for categorical data with a focus on proportions.
- This lecture is a summary; it does not cover every detail taught in class.
- Recommended companion: the Unit 6 study guide from the Ultimate Review Packet.
Key Concepts
- Inference for Proportions: Proportions summarize categorical data (the share of individuals that fall in a category).
- Example questions: What proportion of students did their homework? Is there a difference between the proportions of male and female students who did?
- Statistical Inference: Using sample statistics to draw conclusions about a population parameter.
- A sample statistic (e.g., 78% of sampled students did homework) estimates the population proportion, but it varies from sample to sample and rarely matches it exactly.
Inference Procedures
- Confidence Intervals
- Estimate a range of plausible values for the population parameter.
- Significance Tests
- Assess how much evidence the sample gives against a claim about a population parameter.
Confidence Intervals for Population Proportions
Steps to Construct
- Sample Analysis: Sample must be:
- Random (to avoid bias).
- Less than 10% of the population when sampling without replacement (so observations can be treated as independent).
- Large enough (at least 10 successes and 10 failures).
- Point Estimate: The sample proportion (p-hat), which estimates the unknown population proportion p.
- Sampling Distribution: The distribution of p-hat over all possible samples of the same size; approximately normal when the conditions above are met.
- Accuracy: In about 95% of samples, p-hat falls within 1.96 standard deviations of the true proportion.
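The three conditions above can be sketched as a small check. This is only an illustration: the function name `check_conditions` and its parameters are hypothetical, and whether the sample was random has to be judged from the study design, so it is passed in as a flag.

```python
def check_conditions(n, p_hat, population_size, is_random):
    """Return which conditions for a one-sample z-interval are met.

    n               -- sample size
    p_hat           -- sample proportion
    population_size -- size of the population (assumed known or estimated)
    is_random       -- True if the data came from a random sample
    """
    successes = n * p_hat          # expected count of successes in the sample
    failures = n * (1 - p_hat)     # expected count of failures in the sample
    return {
        "random": is_random,
        "ten_percent": n < 0.10 * population_size,
        "large_counts": successes >= 10 and failures >= 10,
    }


# Hypothetical numbers: 780 sampled out of a population of 100,000.
print(check_conditions(n=780, p_hat=0.82, population_size=100_000, is_random=True))
```

If any entry comes back `False`, the normal approximation behind the interval is questionable.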
Confidence Interval Formula
- General Formula:
p-hat ± z* × SE(p-hat)
SE(p-hat) = sqrt(p-hat × (1 − p-hat) / n) is the standard error; it uses p-hat in place of the unknown p.
Confidence Levels
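- The confidence level (90%, 95%, 99%) affects the width of the interval: higher confidence requires a wider interval.
- The critical value z* is determined by the confidence level:
- 95% confidence uses z* = 1.96 (the ± is already part of the interval formula).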
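As a rough sketch of where z* comes from, the critical value is the point on the standard normal curve that leaves half of the leftover probability in each tail. The helper name `critical_value` is an assumption for illustration; `statistics.NormalDist` is part of the Python standard library (3.8+).

```python
from statistics import NormalDist


def critical_value(confidence):
    """Return z* for a two-sided interval at the given confidence level."""
    tail = (1 - confidence) / 2              # area left in each tail
    return NormalDist().inv_cdf(1 - tail)    # standard normal quantile


for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.3f}")
```

This should give approximately 1.645, 1.960, 2.326, and 2.576, matching the usual table values.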
Example Problem
- Random sample: 780 teachers, 82% have college loan debt.
- Construct a 98% confidence interval using four steps:
- Name the procedure (one-sample z-interval for a population proportion).
- Check conditions (random, <10% of the population, at least 10 successes and 10 failures).
- Calculate interval.
- Interpret interval in context.
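The calculation step for this example might look like the sketch below. It uses only the numbers given above (n = 780, p-hat = 0.82, 98% confidence); the variable names are illustrative, and the random and 10% conditions still have to be argued from context rather than computed.

```python
from math import sqrt
from statistics import NormalDist

n = 780            # sample size from the example
p_hat = 0.82       # sample proportion with college loan debt
confidence = 0.98  # requested confidence level

# Large-counts check: both expected counts should be at least 10.
successes = n * p_hat
failures = n * (1 - p_hat)
large_counts_ok = successes >= 10 and failures >= 10

# Critical value, standard error, and margin of error.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin = z_star * standard_error

low, high = p_hat - margin, p_hat + margin

print(f"Large counts met: {large_counts_ok}")
print(f"98% interval: ({low:.3f}, {high:.3f})")
```

The interval comes out to roughly 0.788 to 0.852, so the interpretation would read: we are 98% confident that the true proportion of all teachers with college loan debt is between about 78.8% and 85.2%.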
Interpretation
- Confidence Level Meaning:
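- 98% confidence means that if we took many random samples of the same size and built an interval from each, about 98% of those intervals would capture the true population proportion.
- Practical Application: Use the interval to judge a claim (e.g., that more than 70% of teachers have loan debt). If every value in the interval is above 0.70, the data support the claim.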
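The meaning of the confidence level can be illustrated with a small simulation. This is a sketch under stated assumptions: the true proportion (0.80), the sample size (300), and the number of trials are made-up values chosen only for the demonstration, and `coverage_rate` is a hypothetical helper name.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage_rate(true_p, n, confidence, trials, seed=0):
    """Share of simulated intervals that capture the true proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and count the successes.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials


rate = coverage_rate(true_p=0.80, n=300, confidence=0.98, trials=1000)
print(f"Share of intervals that captured the true proportion: {rate:.3f}")
```

The printed share should land near 0.98. The confidence level describes the long-run success rate of the method, not the chance that any one computed interval is correct.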
Determining Sample Size for Desired Margin of Error
- Researchers can calculate the required sample size for a given margin of error and confidence level.
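- Example: finding the sample size needed for a 2% margin of error. Solve z* × sqrt(p(1 − p)/n) ≤ margin of error for n, using p = 0.5 when no prior estimate is available, and round up to the next whole number.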
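A minimal sketch of the sample-size calculation follows. The notes do not state the confidence level used in the lecture example, so 95% is an assumption here, as is the conservative guess p = 0.5; `required_sample_size` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(margin_of_error, confidence, p_guess=0.5):
    """Smallest n whose margin of error is at most the requested value.

    p_guess=0.5 is the conservative choice because p(1 - p) is largest there.
    """
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z_star / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    return ceil(n)  # always round up: rounding down would miss the target


# Assumed settings: 2% margin of error at 95% confidence.
n_needed = required_sample_size(margin_of_error=0.02, confidence=0.95)
print(n_needed)
```

Under these assumptions the result is 2,401 respondents. A higher confidence level or a smaller margin of error would push the required sample size up.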
Confidence Intervals for Difference in Proportions
- Two-Sample z-Interval: Estimates the difference between two population proportions (p1 − p2).
- Example: Difference between teacher and nurse debt proportions.
- Four-Step Method:
- Name procedure in context.
- Check conditions for both samples.
- Build the interval using sample data.
- Interpret in context.
- Interpretation: An interval entirely above zero suggests the first group has the higher proportion; an interval entirely below zero suggests the second group does.
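The "build the interval" step for two samples can be sketched as follows. The teacher figures (82% of 780) come from the earlier example; the nurse figures (75% of 600) are hypothetical values invented for illustration, since the notes do not give them. The function name `two_prop_z_interval` is also an assumption.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_interval(p_hat1, n1, p_hat2, n2, confidence):
    """Confidence interval for p1 - p2 from two independent samples."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error of the difference: the two variances add.
    standard_error = sqrt(
        p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2
    )
    diff = p_hat1 - p_hat2
    margin = z_star * standard_error
    return diff - margin, diff + margin


# Teachers: from the example above. Nurses: HYPOTHETICAL numbers.
low, high = two_prop_z_interval(
    p_hat1=0.82, n1=780, p_hat2=0.75, n2=600, confidence=0.95
)
print(f"95% interval for (teachers - nurses): ({low:.3f}, {high:.3f})")
```

With these made-up nurse numbers the whole interval sits above zero, which would suggest a higher debt proportion among teachers. An interval that contains zero would not support that conclusion.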
Additional Considerations
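- An interval with a negative lower end and a positive upper end contains zero, so it does not give convincing evidence of a difference between the two populations.
- Intervals can be used to justify claims about population differences, provided the conclusion is stated in context.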
Conclusion
- Two Types of Confidence Intervals: One-sample for a single population proportion, two-sample for the difference between two proportions.
- AP Stats Formula Sheet: Provides necessary formulas for sampling distributions and standard errors.
In the next session, the focus will shift to significance tests for population proportions.