Overview
This lecture covered how to describe categorical data with frequency and relative frequency tables, built both manually and in Google Sheets.
Review of Previous Concepts
- Statistics has two main branches: descriptive statistics and inferential statistics.
- A population is the full set of items of interest; a sample is a subset of the population.
- The data discussed here is structured: it is arranged in tables where columns are variables and rows are observations.
- Data types include categorical (labels or categories) and numerical (numbers).
- Cross-sectional data is collected at one point in time; time series data is collected over time.
- Measurement scales: categorical data (nominal, ordinal); numerical data (interval, ratio).
Describing Categorical Data
- Categorical data is summarized using frequency distributions (tables showing category counts).
- A frequency table lists each distinct category, tally marks for occurrences, and the frequency (count).
- Steps to construct a manual frequency table:
  - List the distinct category values.
  - For each observation, mark a tally for its category.
  - Count the tallies for each category to get the frequencies.
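The three manual steps above can be sketched in Python. This is only an illustration: the function name `tally_frequencies` and the sample observations are made up here and do not come from the lecture.

```python
def tally_frequencies(observations):
    """Build a frequency table the manual way: list categories, tally, count."""
    # Step 1: list the distinct category values, in order of first appearance.
    categories = []
    for value in observations:
        if value not in categories:
            categories.append(value)

    # Step 2: for each observation, add one tally mark to its category.
    tallies = {category: "" for category in categories}
    for value in observations:
        tallies[value] += "|"

    # Step 3: count the tally marks in each category to get the frequency.
    return {category: len(marks) for category, marks in tallies.items()}


# Hypothetical sample data (not from the lecture).
colors = ["red", "blue", "red", "green", "blue", "red"]
frequencies = tally_frequencies(colors)

# Print one row per category: label, tally marks, frequency.
for category, count in frequencies.items():
    print(f"{category:<6} {'|' * count:<4} {count}")
```

Keeping the tally marks as a separate step mirrors the paper method; in practice the counting can be done directly, without the intermediate marks.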
Frequency Tables in Google Sheets
- Enter the data in a column with a header (e.g., "Category").
- Highlight the data, including the header, and select "Insert" > "Pivot table" (older versions of Google Sheets list it under "Data").
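- In the pivot table editor, add the category variable under "Rows", then add the same variable under "Values" and summarize it by COUNTA, which counts non-empty cells (COUNT counts only numeric cells, so it returns 0 for text categories).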
- The resulting table displays categories and their counts.
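For comparison, the same grouping and counting can be approximated outside a spreadsheet. The sketch below assumes a single column with the header "Category", as in the first step above; the cell values are hypothetical, and this is an analogy to what the pivot table produces, not a description of how Google Sheets computes it.

```python
import csv
import io
from collections import Counter

# Hypothetical sheet contents: one column with the header "Category".
sheet = io.StringIO("Category\nA\nB\nA\nC\nA\nB\n")

# Group the rows by the "Category" column and count the rows in each group.
rows = csv.DictReader(sheet)
counts = Counter(row["Category"] for row in rows)

# Print the categories in sorted order with their counts.
for category in sorted(counts):
    print(category, counts[category])
```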
Relative Frequency
- Relative frequency = frequency of category / total number of observations.
- Add a column that divides each frequency by the total number of observations; every resulting value lies between 0 and 1.
- The relative frequencies of all categories sum to 1 (small deviations can appear when the values are rounded).
- Relative frequency tables allow comparison between datasets of different sizes.
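As a minimal sketch of the formula above, the following Python divides each frequency by the total and checks that the results sum to 1. The function name `relative_frequencies` and both frequency tables are assumptions made up for illustration. The two hypothetical datasets differ in size but have the same proportions, which shows why relative frequencies make them comparable.

```python
import math


def relative_frequencies(frequencies):
    """Divide each category's frequency by the total number of observations."""
    total = sum(frequencies.values())
    if total == 0:
        raise ValueError("cannot compute relative frequencies with no observations")
    return {category: count / total for category, count in frequencies.items()}


# Hypothetical frequency tables for two datasets of different sizes.
small_survey = {"yes": 6, "no": 14}      # 20 observations
large_survey = {"yes": 45, "no": 105}    # 150 observations

small_rel = relative_frequencies(small_survey)
large_rel = relative_frequencies(large_survey)

# The relative frequencies in each table sum to 1 (allowing for rounding).
assert math.isclose(sum(small_rel.values()), 1.0)
assert math.isclose(sum(large_rel.values()), 1.0)

# The raw counts differ widely, but the proportions can be compared directly.
for category in small_survey:
    print(category, round(small_rel[category], 2), round(large_rel[category], 2))
```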
Key Terms & Definitions
- Descriptive statistics — summarizing and describing data features.
- Inferential statistics — making predictions or inferences about a population from a sample.
- Structured data — data in a tabular (row/column) format.
- Categorical data — non-numeric data sorted by category (nominal or ordinal).
- Frequency — count of occurrences for each category.
- Frequency table — table showing each category and its frequency.
- Relative frequency — proportion of each category relative to the total observations.
Action Items / Next Steps
- Practice constructing manual frequency tables with sample categorical data.
- Create frequency and relative frequency tables using Google Sheets.
- Complete exercises calculating relative frequencies for given data sets.