Overview
This lecture explains how to create a frequency distribution table to summarize large data sets, illustrated with a 25-person survey on weekly exercise hours.
Purpose of Frequency Distribution Tables
- Frequency distribution tables organize large data sets into summarized, easy-to-read groups (classes or bins).
- They help condense many observations into fewer, meaningful rows for analysis.
Steps to Construct a Frequency Distribution Table
- Step 1: Determine the class width: divide the data range (maximum minus minimum) by the number of classes and round up to a convenient number.
- Step 2: Decide on lower and upper class limits; start at the minimum value and increment by the class width.
- Lower class limits begin with the smallest data value and increase by class width for each subsequent class.
- Upper class limits are one unit less than the next class's lower limit, where the unit matches the precision of the data: 1 for whole numbers, 0.1 for data recorded to one decimal place.
- Step 3: Count the number of data points (frequency) within each class.
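The three steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation: it assumes whole-number data, the function name `frequency_table` is hypothetical, and the sample values are made up for the demonstration rather than taken from the lecture's survey. It also assumes one common convention for the exact-division case, noted in the comments.

```python
def frequency_table(data, num_classes):
    """Return (lower_limit, upper_limit, frequency) rows for whole-number data."""
    lowest, highest = min(data), max(data)

    # Step 1: class width = range / number of classes, rounded up.
    # Assumption: when the division is exact we still go up by one,
    # so that the last class always reaches the maximum value.
    width = (highest - lowest) // num_classes + 1

    rows = []
    for i in range(num_classes):
        # Step 2: lower limits start at the minimum and step by the width;
        # each upper limit is one unit below the next lower limit.
        lower = lowest + i * width
        upper = lower + width - 1
        # Step 3: count the observations that fall inside the class.
        frequency = sum(1 for x in data if lower <= x <= upper)
        rows.append((lower, upper, frequency))
    return rows


# Made-up sample values for illustration only (not the lecture's survey data).
sample_hours = [1, 4, 4, 7, 10, 12, 0, 3, 8, 2]

for lower, upper, frequency in frequency_table(sample_hours, 5):
    print(f"{lower}-{upper}: {frequency}")
```

Because every class has the same width and the classes do not overlap, each observation is counted exactly once, so the frequencies always sum to the number of observations.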
Example: Exercise Hours Data
- Data range: lowest value is 0, highest is 13.
- Number of classes: 5.
- Class width: (13 - 0) ÷ 5 = 2.6, rounded up to 3.
- Lower class limits: 0, 3, 6, 9, 12.
- Upper class limits: 2, 5, 8, 11, 14.
- Frequency counts for each class: 0–2 (8), 3–5 (5), 6–8 (6), 9–11 (5), 12–14 (1).
- Check that total frequencies add to the total number of observations (25).
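The arithmetic in this example can be double-checked with a short Python sketch. The minimum, maximum, number of classes, and frequency counts are the ones stated in the lecture; the variable names are illustrative. The raw survey responses are not reproduced in these notes, so the counts are entered directly rather than computed.

```python
import math

lowest, highest, num_classes = 0, 13, 5

# Class width: (13 - 0) / 5 = 2.6, rounded up to 3.
width = math.ceil((highest - lowest) / num_classes)

# Lower limits start at the minimum and increase by the class width.
lower_limits = [lowest + i * width for i in range(num_classes)]

# Each upper limit is one unit below the next lower limit.
upper_limits = [lower + width - 1 for lower in lower_limits]

# Frequency counts as reported in the lecture.
frequencies = [8, 5, 6, 5, 1]
total = sum(frequencies)

print(width)          # 3
print(lower_limits)   # [0, 3, 6, 9, 12]
print(upper_limits)   # [2, 5, 8, 11, 14]
print(total)          # 25
```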
Using the Frequency Table
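- Questions like "how many people exercise 5 hours or fewer?" can be answered by adding the relevant class frequencies: 8 + 5 = 13. The cutoff has to coincide with a class limit, because the table does not show where values fall inside a class.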
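As a small sketch of this kind of lookup, the snippet below adds the frequencies of whole classes from the exercise-hours table. The helper name `count_between` is hypothetical. It only counts classes that lie entirely inside the requested interval, since a grouped table cannot split a class.

```python
# Classes and frequencies from the exercise-hours example.
classes = [(0, 2), (3, 5), (6, 8), (9, 11), (12, 14)]
frequencies = [8, 5, 6, 5, 1]


def count_between(low, high):
    """Add the frequencies of every class lying entirely within [low, high]."""
    return sum(
        frequency
        for (lower, upper), frequency in zip(classes, frequencies)
        if low <= lower and upper <= high
    )


print(count_between(0, 5))    # 8 + 5 = 13 people exercise 5 hours or fewer
print(count_between(9, 14))   # 5 + 1 = 6 people exercise 9 hours or more
```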
Key Terms & Definitions
- Frequency distribution table: A table that summarizes data by grouping values into classes and counting their frequencies.
- Class/bin: A group or interval into which data points are sorted.
- Class width: The size of each class, usually (range ÷ number of classes), rounded up.
- Class limits: The lowest and highest values that fit in each class.
Action Items / Next Steps
- Review how to create class boundaries and class midpoints in the next lesson.
- Practice constructing frequency distribution tables with different data sets.