Overview
This lecture introduces probability distributions, sampling distributions, and hypothesis testing, using Dungeons & Dragons and tuna cans as relatable examples.
Probability Distributions
- Probability distributions describe the likelihood of each possible outcome of a random process.
- The uniform distribution models scenarios where all outcomes are equally likely, like rolling a fair die.
- The probability of rolling a 5 or 6 on a single six-sided die is 2/6 or about 33%.
- For continuous outcomes (e.g., a value that can be any real number between 1 and 6), the probability of an interval is the area under the distribution curve over that interval.
- The normal distribution (bell curve) approximately describes sums of many independent random variables, like the total from rolling multiple dice.
- Probabilities for ranges in a normal distribution can be found by calculating the area under the curve over that range.
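The bullets above can be checked numerically. The following is a minimal sketch, not the lecture's own method: the helper names (`discrete_uniform_prob`, `continuous_uniform_prob`, `dice_sum_distribution`) and the choice of 10 dice with a cutoff of 30 are assumptions made for illustration. It computes the 2/6 die probability, an interval probability for a continuous uniform variable on [1, 6], and compares the exact distribution of a dice total with its normal approximation.

```python
import math
from fractions import Fraction
from statistics import NormalDist


def discrete_uniform_prob(favorable, outcomes):
    """P(event) when every outcome in `outcomes` is equally likely."""
    outcomes = set(outcomes)
    return Fraction(len(set(favorable) & outcomes), len(outcomes))


def continuous_uniform_prob(lo, hi, a, b):
    """P(a <= X <= b) for X uniform on [lo, hi]: the area of a rectangle."""
    left, right = max(a, lo), min(b, hi)  # clip the interval to [lo, hi]
    width = max(0.0, right - left)
    height = 1.0 / (hi - lo)  # the density is constant on [lo, hi]
    return width * height


def dice_sum_distribution(n_dice, sides=6):
    """Exact distribution of the total of n_dice fair dice (by convolution)."""
    dist = {0: Fraction(1)}
    for _ in range(n_dice):
        new = {}
        for total, p in dist.items():
            for face in range(1, sides + 1):
                new[total + face] = new.get(total + face, Fraction(0)) + p / sides
        dist = new
    return dist


# Rolling a 5 or 6 on one fair six-sided die.
p_die = discrete_uniform_prob({5, 6}, range(1, 7))

# Continuous value uniform on [1, 6]: probability it lands between 5 and 6.
p_interval = continuous_uniform_prob(1, 6, 5, 6)

# Total of 10 dice (illustrative choice): exact P(total <= 30) vs. normal curve.
n = 10
dist = dice_sum_distribution(n)
exact = float(sum(p for total, p in dist.items() if total <= 30))
mu = n * 3.5                    # mean of one fair die is 3.5
sigma = math.sqrt(n * 35 / 12)  # variance of one fair die is 35/12
approx = NormalDist(mu, sigma).cdf(30.5)  # 30.5 applies a continuity correction

print(f"P(5 or 6)            = {p_die} = {float(p_die):.3f}")
print(f"P(5 <= X <= 6)       = {p_interval:.3f}")
print(f"P(total <= 30) exact = {exact:.4f}, normal approx = {approx:.4f}")
```

Note that the discrete die gives 2/6 while the continuous case gives 1/5, because the continuous variable spreads its probability over an interval of length 5 rather than over six separate faces.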
The Standard Normal Distribution and Z-Scores
- The standard normal distribution has a mean of 0 and standard deviation of 1.
- To compare values from different normal distributions, or to look a value up in a z-table, convert it to a z-score: z = (value - mean) / standard deviation.
- Z-tables provide the area (probability) to the left of a given z-score.
- Example: If the mean turn time is 3 minutes (SD = 1), then 2 minutes has a z-score of (2 - 3) / 1 = -1, so the chance someone finishes in under 2 minutes is about 16%.
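The turn-time example can be reproduced without a printed z-table. This is a small sketch that uses Python's standard-library `statistics.NormalDist` in place of a table lookup; the function names `z_score` and `prob_below` are made up for illustration.

```python
from statistics import NormalDist


def z_score(value, mean, sd):
    """Number of standard deviations `value` lies from `mean`."""
    return (value - mean) / sd


def prob_below(value, mean, sd):
    """Area to the left of `value`, i.e. what a z-table reports for its z-score."""
    z = z_score(value, mean, sd)
    return NormalDist().cdf(z)  # NormalDist() is the standard normal (mean 0, SD 1)


z = z_score(2, 3, 1)
p = prob_below(2, 3, 1)
print(f"z = {z:.2f}, P(time < 2 min) = {p:.4f}")
# prints: z = -1.00, P(time < 2 min) = 0.1587
```

The area to the right of a value is 1 minus the area to the left, so the chance of a turn lasting longer than 2 minutes would be about 84% under the same assumptions.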
Sampling Distributions
- The true mean of a population is often unknown, so we estimate it from a representative sample.
- A sample statistic (like the sample mean) varies from sample to sample.
- The sampling distribution is the distribution of a statistic (e.g., the mean) across many samples of the same size drawn from the same population.
- The standard error is the standard deviation of the sampling distribution. For the sample mean, it is estimated by dividing the sample standard deviation by the square root of the sample size.
- When the sample size is large enough, the means of repeated samples cluster around the population mean and their distribution is approximately normal.
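A quick simulation can make the sampling distribution concrete. The sketch below is an illustration under assumed settings, not data from the lecture: the "population" is rolls of a fair six-sided die, the sample size of 30, the 2000 repetitions, and the fixed seed are all arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is reproducible


def roll_sample(n):
    """Draw one sample of n rolls from the 'population' of fair die rolls."""
    return [rng.randint(1, 6) for _ in range(n)]


n = 30              # size of each sample (assumed for illustration)
num_samples = 2000  # number of repeated samples (assumed for illustration)

# The sample mean changes from sample to sample; collect many of them.
sample_means = []
for _ in range(num_samples):
    sample = roll_sample(n)
    sample_means.append(sum(sample) / n)

# Centre and spread of the simulated sampling distribution.
mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# Standard error from theory: population SD / sqrt(n), with SD = sqrt(35/12).
theoretical_se = math.sqrt(35 / 12) / math.sqrt(n)

# In practice only one sample is available, so the SE is estimated from it.
one_sample = roll_sample(n)
estimated_se = statistics.stdev(one_sample) / math.sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean is 3.5)")
print(f"SD of sample means:   {sd_of_means:.3f}")
print(f"theoretical SE:       {theoretical_se:.3f}")
print(f"SE from one sample:   {estimated_se:.3f}")
```

The spread of the simulated sample means should land close to the theoretical standard error, and the single-sample estimate should be in the same neighbourhood, which is why one sample is enough to approximate it.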
Hypothesis Testing and P-values
- Use sampling distributions to test claims (hypotheses) about population parameters.
- A test statistic (z-score) shows how far the sample mean is from the hypothesized mean, measured in standard errors.
- The p-value is the probability of observing a result at least as extreme as the sample result, assuming the hypothesis is true.
- A low p-value indicates that the observed outcome is unlikely if the hypothesis is true, suggesting the hypothesis may be false.
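The steps above can be sketched as a short calculation. This is a minimal example under stated assumptions: the numbers (a claimed mean of 150, a sample mean of 148, a standard deviation of 5, and 25 observations) are hypothetical and not taken from the lecture, the function name `z_test` is made up, and "at least as extreme" is interpreted as two-sided, counting deviations in either direction.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, hypothesized_mean, sd, n):
    """Return (z, two-sided p-value) for a claim about a population mean."""
    standard_error = sd / math.sqrt(n)
    # Distance between the sample mean and the claimed mean, in standard errors.
    z = (sample_mean - hypothesized_mean) / standard_error
    # Two-sided: area in both tails beyond |z| under the standard normal curve.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


# Hypothetical numbers, for illustration only.
z, p_value = z_test(sample_mean=148.0, hypothesized_mean=150.0, sd=5.0, n=25)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
# prints: z = -2.00, p-value = 0.0455
```

With these made-up inputs, a result this far from the claimed mean would occur in fewer than 5% of samples if the claim were true, which counts as evidence against the claim. A one-sided test would use only one tail and give half this p-value.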
Key Terms & Definitions
- Probability Distribution — Describes the likelihood of all possible outcomes in a random process.
- Uniform Distribution — A probability distribution where all outcomes are equally likely.
- Normal Distribution — A bell-shaped probability distribution characterized by a mean and standard deviation.
- Standard (Normal) Distribution — A normal distribution with mean 0 and standard deviation 1.
- Z-score — Number of standard deviations a value is from the mean.
- Sampling Distribution — Distribution of a statistic (like the mean) over many samples from a population.
- Standard Error — The standard deviation of a sampling distribution.
- Hypothesis Testing — A method to test assumptions about population parameters using sample data.
- Test Statistic — A standardized value (like a z-score) used in hypothesis testing.
- P-value — Probability of observing a statistic as extreme as, or more extreme than, the sample, assuming the hypothesis is true.
Action Items / Next Steps
- Practice calculating probabilities using both uniform and normal distributions.
- Review how to standardize data and use z-tables for probability calculations.
- Study the steps and reasoning behind hypothesis testing, including calculation and interpretation of p-values.