Categorical Data Analysis Basics

Jul 21, 2025

Overview

This lecture introduces basic categorical data analysis, focusing on percentages and proportions: what each one means and how to convert between them as they are used in statistics.

Types of Data

  • Data can be categorized as either categorical (descriptive, non-numeric) or quantitative (numeric).
  • Categorical data describes traits like location, school, or other identifiers.

Percentages and Proportions

  • A percentage represents a part out of 100; "percent" means "per hundred."
  • A proportion is a part of a whole written as a decimal (e.g., 0.25 rather than 25%); it is the decimal form of a percentage and is used widely in statistics.
  • In statistics, "proportion" does not refer to the algebraic meaning involving cross-multiplication.

Converting Percentages and Proportions

  • To convert a percentage to a proportion, divide by 100 (move the decimal two places left).
    • Example: 33.7% → 0.337
    • Example: 100% → 1
    • Example: 6% → 0.06
  • To convert a proportion to a percentage, multiply by 100 (move the decimal two places right), then add the % sign.
    • Example: 0.018 → 1.8%
    • Example: 0.873 → 87.3%
    • Example: 1 → 100%
  • Writing a leading zero before the decimal point is standard practice for values below 1 (e.g., 0.06 rather than .06).
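
The two conversion rules above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the 'g' format is one possible way to hide floating-point rounding noise when printing.

```python
def percent_to_proportion(percent: float) -> float:
    """Divide by 100 (the decimal point moves two places left)."""
    return percent / 100


def proportion_to_percent(proportion: float) -> float:
    """Multiply by 100 (the decimal point moves two places right)."""
    return proportion * 100


def format_percent(proportion: float) -> str:
    """Render a proportion as a percentage string with the % sign.

    The 'g' format keeps up to 6 significant digits, which hides
    floating-point rounding noise in the product.
    """
    return f"{proportion_to_percent(proportion):g}%"


# The examples from the notes
for pct in (33.7, 100, 6):
    print(f"{pct}% -> {percent_to_proportion(pct):g}")

for prop in (0.018, 0.873, 1):
    print(f"{prop} -> {format_percent(prop)}")
```

Note that Python prints the leading zero automatically (0.06, not .06), matching the convention above.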

Scientific Notation and Small Proportions

  • Very small proportions may be represented in scientific notation (e.g., 2.3 × 10^-5).
  • For a negative exponent, move the decimal point left by the size of the exponent to get the standard decimal form (e.g., 2.3 × 10^-5 → 0.000023).
  • Convert to percentage by multiplying by 100 (e.g., 0.000023 → 0.0023%).
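
As a rough sketch of the same steps in Python, the standard-library decimal module can expand scientific notation without floating-point rounding. The function names are hypothetical, and passing the number as a string such as "2.3e-5" is an assumption made for this example.

```python
from decimal import Decimal


def to_standard_decimal(sci: str) -> str:
    """Expand a number written in scientific notation, e.g. '2.3e-5'."""
    # The 'f' format forces plain fixed-point output with no exponent.
    return format(Decimal(sci), "f")


def to_percent(sci: str) -> str:
    """Convert a proportion in scientific notation to a percentage string."""
    # scaleb(2) multiplies by 10**2, i.e. shifts the decimal point
    # two places to the right.
    return format(Decimal(sci).scaleb(2), "f") + "%"


print(to_standard_decimal("2.3e-5"))  # 0.000023
print(to_percent("2.3e-5"))           # 0.0023%
```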

Categorical Data Analysis Basics

  • The first step in analyzing categorical data is to find the count, or frequency, of the category of interest (the number of "successes"), often denoted by X.
  • The total sample size is denoted by n, representing the total number of observations, objects, or trials.
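
To make X and n concrete, here is a minimal sketch using a made-up sample. The data and variable names are hypothetical. The last step, dividing X by n to get a sample proportion, is the standard definition, though these notes do not state it explicitly.

```python
from collections import Counter

# Hypothetical sample: the region recorded for each of 5 survey responses
regions = ["north", "south", "north", "east", "north"]

counts = Counter(regions)  # frequency of every category
n = len(regions)           # sample size: total number of observations
x = counts["north"]        # count for the category of interest ("successes")

proportion = x / n         # standard sample proportion, X / n
print(f"X = {x}, n = {n}, proportion = {proportion:g}, "
      f"percent = {proportion * 100:g}%")
```

Here "north" is treated as the success category; any other category could be counted the same way by changing the key looked up in counts.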

Key Terms & Definitions

  • Categorical Data — data described by words or categories, not numbers.
  • Percentage — a value out of 100, denoted by the % symbol.
  • Proportion — the decimal equivalent of a percentage used in statistics.
  • Frequency (Count) — the number of times a particular category appears.
  • Sample Size (n) — the total number of observations in the data set.
  • Scientific Notation — a way to write very small or large numbers using powers of ten.

Action Items / Next Steps

  • Practice converting between percentages and proportions.
  • Be able to interpret and convert numbers in scientific notation.
  • Review definitions of key terms for categorical data analysis.