4.1 How to Use Probability Distributions

Overview

This lecture introduces probability distributions, sampling distributions, and hypothesis testing, using Dungeons & Dragons and tuna cans as relatable examples.

Probability Distributions

Probability distributions describe the likelihood of different outcomes during a random process.
The uniform distribution models scenarios where all outcomes are equally likely, like rolling a fair die.
The probability of rolling a 5 or 6 on a single six-sided die is 2/6 or about 33%.
For continuous outcomes (e.g., a value that can be any real number between 1 and 6), the probability of an interval is the area under the distribution curve over that interval.
The normal distribution (bell curve) approximates the distribution of a sum of many independent random variables, like the total from rolling multiple dice.
Probabilities for ranges in a normal distribution can be found by calculating area under the curve.
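
The calculations above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function names (die_prob, interval_prob, dice_sum_pmf) are made up here, and the choice of three dice and a total of at most 10 is an assumed example. It computes the 2/6 die probability exactly, treats the continuous case as area under a flat density on [1, 6], and compares the exact distribution of a dice total with a normal curve of matching mean and standard deviation.

```python
from fractions import Fraction
from statistics import NormalDist


def die_prob(favorable, faces=6):
    """P(a fair die with `faces` sides lands on one of `favorable`)."""
    hits = sum(1 for f in set(favorable) if 1 <= f <= faces)
    return Fraction(hits, faces)


def interval_prob(lo, hi, a=1.0, b=6.0):
    """Area under a flat (uniform) density on [a, b] between lo and hi."""
    lo, hi = max(lo, a), min(hi, b)
    if hi <= lo:
        return 0.0
    return (hi - lo) / (b - a)


def dice_sum_pmf(n_dice, faces=6):
    """Exact distribution of the total of n_dice fair dice (repeated convolution)."""
    pmf = {0: Fraction(1)}
    for _ in range(n_dice):
        nxt = {}
        for total, p in pmf.items():
            for face in range(1, faces + 1):
                nxt[total + face] = nxt.get(total + face, Fraction(0)) + p / faces
        pmf = nxt
    return pmf


p_die = die_prob({5, 6})          # discrete uniform: 2 favorable faces out of 6
p_interval = interval_prob(5, 6)  # continuous uniform on [1, 6]: width 1 out of 5

# Total of three dice (assumed example): exact probability vs. normal approximation.
N_DICE = 3
pmf = dice_sum_pmf(N_DICE)
exact = sum(p for total, p in pmf.items() if total <= 10)

mean = N_DICE * 3.5               # mean of one die is 3.5
sd = (N_DICE * 35 / 12) ** 0.5    # variance of one die is 35/12
approx = NormalDist(mean, sd).cdf(10.5)  # 10.5 rather than 10: continuity correction

print(p_die, p_interval, float(exact), approx)
```

Note that the discrete and continuous answers differ (1/3 versus 1/5) because the continuous model spreads probability over an interval of width 5 rather than over six faces.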

The Standard Normal Distribution and Z-Scores

The standard normal distribution has a mean of 0 and standard deviation of 1.
To compare values from a normal distribution, convert them to z-scores: z = (value - mean) / standard deviation.
Z-tables provide the area (probability) to the left of a given z-score.
Example: If the mean turn time is 3 minutes (SD = 1), the chance someone finishes in under 2 minutes is about 16%.
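
The turn-time example can be checked without a printed z-table. The sketch below assumes Python 3.8 or later, where the standard library's statistics.NormalDist provides the cumulative area to the left of a value; the helper name z_score is introduced here only for illustration.

```python
from statistics import NormalDist


def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies from `mean`."""
    return (value - mean) / sd


# Turn times: mean 3 minutes, SD 1 minute (numbers from the example above).
z = z_score(2, mean=3, sd=1)

# Area to the left of z under the standard normal curve (what a z-table lists).
p_under_2 = NormalDist().cdf(z)

print(f"z = {z:.1f}, P(turn < 2 min) = {p_under_2:.4f}")
```

A z-score of -1 leaves roughly 16% of the area to its left, which matches the figure quoted in the example.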

Sampling Distributions

The true mean of a population is often unknown, so it is estimated from a representative sample.
A sample statistic (like the sample mean) varies from sample to sample.
The sampling distribution is the distribution of a statistic (e.g., the mean) computed across many samples of the same size.
The standard error is the standard deviation of the sampling distribution and is estimated by dividing the sample standard deviation by the square root of the sample size.
Across many samples, the sample means cluster around the population mean; when each sample is reasonably large, their distribution is approximately normal (the central limit theorem).
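
A short simulation can make the sampling distribution concrete. The setup below is an assumption chosen for illustration, not data from the lecture: the "population" is rolls of a fair six-sided die, each sample has 30 rolls, 2000 samples are drawn, and the seed is fixed arbitrarily so the run is repeatable.

```python
import random
from statistics import mean, pstdev, stdev

rng = random.Random(42)  # arbitrary fixed seed for repeatability

SAMPLE_SIZE = 30     # rolls per sample (assumed)
NUM_SAMPLES = 2000   # number of samples drawn (assumed)


def roll_sample(n):
    """One sample: n rolls of a fair six-sided die."""
    return [rng.randint(1, 6) for _ in range(n)]


# Simulated sampling distribution of the mean: one mean per sample.
sample_means = [mean(roll_sample(SAMPLE_SIZE)) for _ in range(NUM_SAMPLES)]

# Its standard deviation is the standard error, seen empirically.
observed_se = pstdev(sample_means)

# Theoretical standard error: population SD of one die / sqrt(sample size).
population_sd = (35 / 12) ** 0.5
theoretical_se = population_sd / SAMPLE_SIZE ** 0.5

# In practice only one sample exists, so the SE is estimated from that sample.
one_sample = roll_sample(SAMPLE_SIZE)
estimated_se = stdev(one_sample) / SAMPLE_SIZE ** 0.5

print(f"mean of sample means: {mean(sample_means):.3f} (population mean 3.5)")
print(f"observed SE:    {observed_se:.3f}")
print(f"theoretical SE: {theoretical_se:.3f}")
print(f"estimated SE from one sample: {estimated_se:.3f}")
```

The observed and theoretical standard errors should land close together, while the single-sample estimate varies more from run to run because it relies on only 30 rolls.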

Hypothesis Testing and P-values

Use sampling distributions to test claims (hypotheses) about population parameters.
A test statistic (here, a z-score) shows how far the sample mean is from the hypothesized mean, measured in standard errors.
The p-value is the probability of observing a result at least as extreme as the sample result, assuming the hypothesis being tested is true.
A low p-value indicates that the observed outcome is unlikely if the hypothesis is true, suggesting the hypothesis may be false.
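
The steps above can be sketched as a one-sample z-test. All numbers here are hypothetical and chosen only to make the arithmetic easy to follow: a claimed mean turn time of 3 minutes, and a sample of 25 turns with mean 3.4 and standard deviation 1. The two-sided p-value is one common convention, assumed here; with a sample this small and the SD estimated from the sample, a t-test would usually be preferred over the z approximation.

```python
from statistics import NormalDist

# Hypothetical inputs (not from the lecture).
hypothesized_mean = 3.0   # claimed mean turn time, in minutes
sample_mean = 3.4         # mean of the observed sample
sample_sd = 1.0           # standard deviation of the observed sample
n = 25                    # sample size

# Standard error: sample SD divided by the square root of the sample size.
standard_error = sample_sd / n ** 0.5

# Test statistic: distance from the hypothesized mean, in standard errors.
z = (sample_mean - hypothesized_mean) / standard_error

# Two-sided p-value: area in both tails beyond |z| under the standard normal.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"SE = {standard_error:.2f}, z = {z:.2f}, p = {p_value:.4f}")
```

With these inputs the sample mean sits about two standard errors above the claim, giving a p-value of roughly 0.05, so the data would be fairly unlikely if the claimed mean were correct.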

Key Terms & Definitions

Probability Distribution — Describes the likelihood of all possible outcomes in a random process.
Uniform Distribution — A probability distribution where all outcomes are equally likely.
Normal Distribution — A bell-shaped probability distribution characterized by a mean and standard deviation.
Standard (Normal) Distribution — A normal distribution with mean 0 and standard deviation 1.
Z-score — Number of standard deviations a value is from the mean.
Sampling Distribution — Distribution of a statistic (like the mean) over many samples from a population.
Standard Error — The standard deviation of a sampling distribution.
Hypothesis Testing — A method to test assumptions about population parameters using sample data.
Test Statistic — A standardized value (like a z-score) used in hypothesis testing.
P-value — Probability of observing a statistic as extreme as, or more extreme than, the sample, assuming the hypothesis is true.

Action Items / Next Steps

Practice calculating probabilities using both uniform and normal distributions.
Review how to standardize data and use z-tables for probability calculations.
Study the steps and reasoning behind hypothesis testing, including calculation and interpretation of p-values.