Understanding Chi-Squared Goodness of Fit Test

Nov 24, 2024

Lecture Notes: Chi-Squared Goodness of Fit Test

Revision: Chi-Squared Independence Testing

  • Null Hypothesis (H0): Assumes two variables are independent.
  • Degrees of Freedom: Calculated as (rows - 1) * (columns - 1).
  • Conditions for Rejection of H0 (either one is sufficient):
    • If p-value < significance level.
    • If chi-squared statistic > critical value.
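
As a quick illustration of the degrees-of-freedom formula above, here is a minimal Python sketch. The function name independence_df and the 3 x 4 table shape are illustrative assumptions, not part of the lecture.

```python
def independence_df(rows: int, columns: int) -> int:
    """Degrees of freedom for a chi-squared test of independence."""
    if rows < 2 or columns < 2:
        raise ValueError("a contingency table needs at least 2 rows and 2 columns")
    return (rows - 1) * (columns - 1)


# Hypothetical 3 x 4 contingency table: (3 - 1) * (4 - 1) = 6
print(independence_df(3, 4))
```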

Introduction to Chi-Squared Goodness of Fit Test (GOF)

  • Purpose: To test whether sample data are consistent with a population that follows a specified distribution (often uniform).
  • Null Hypothesis (H0): Data follows a specified distribution (e.g., manufacturer's specifications).
  • Alternate Hypothesis (H1): Data does not follow the specified distribution.
  • Degrees of Freedom: Calculated as n - 1, where n is the number of categories.
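
The test compares observed and expected counts through the chi-squared statistic, the sum over all categories of (observed - expected)^2 / expected. The notes do not spell this formula out, so the following is a minimal Python sketch of the standard definition. The function name chi_squared_statistic and the die-roll counts are made up for illustration.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up example: 60 rolls of a die, uniform expectation of 10 per face.
observed = [8, 12, 9, 11, 6, 14]
expected = [10] * 6

statistic = chi_squared_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # n - 1, where n = number of categories

print(statistic, degrees_of_freedom)
```

Note that the expected values are counts, not proportions, and that both lists must sum to the same total.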

Steps for Conducting a Goodness of Fit Test

  1. Data Entry: Enter observed and expected frequencies into lists, with expected values as counts rather than percentages.
    • Use statistical software or a calculator (TI-Nspire: Lists & Spreadsheet; TI-84: STAT > Edit).
  2. Degrees of Freedom: Calculate as n - 1.
  3. Conduct Test: Run the calculator's chi-squared goodness of fit function on the observed and expected lists.
  4. Decision Rule:
    • Reject H0 if:
      • p-value < significance level
      • chi-squared statistic > critical value (if provided)
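
Steps 3 and 4 can be sketched in Python for readers without a calculator. This is a minimal sketch using only the standard library: chi_squared_p_value uses the closed-form upper tail of the chi-squared distribution for integer degrees of freedom, and decide applies both rejection rules. Both function names and the input values are illustrative assumptions. In practice a library routine such as SciPy's scipy.stats.chisquare would normally be used instead.

```python
import math


def chi_squared_p_value(statistic: float, df: int) -> float:
    """Upper-tail probability P(X >= statistic) for integer degrees of freedom."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    half = statistic / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        return math.exp(-half) * sum(
            half ** k / math.factorial(k) for k in range(df // 2)
        )
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k-1/2) / Gamma(k+1/2)
    series = sum(
        half ** (k - 0.5) / math.gamma(k + 0.5)
        for k in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * series


def decide(statistic, df, alpha, critical_value=None):
    """Return (p_value, reject_by_p_value, reject_by_critical_value or None)."""
    p_value = chi_squared_p_value(statistic, df)
    by_p = p_value < alpha
    by_critical = None if critical_value is None else statistic > critical_value
    return p_value, by_p, by_critical


# Made-up statistic of 4.2 with 5 degrees of freedom, tested at the 5% level.
# 11.070 is the standard table critical value for df = 5 at 5%.
p_value, by_p, by_critical = decide(4.2, df=5, alpha=0.05, critical_value=11.070)
print(p_value, by_p, by_critical)
```

When the critical value corresponds to the same significance level and degrees of freedom, the two rules give the same decision, which is why either one is enough.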

Example: Lego Brick Distribution

  • Scenario: Verify if a box of Lego bricks follows the expected color distribution.
    • Expected: 20% white, 30% blue, 10% green, 10% yellow, 20% black, 10% red.
    • Observed frequencies: 82 white, 91 blue, 40 green, 90 yellow, 120 black, 77 red.
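  • Calculate Expected Frequencies:
    • Multiply each expected percentage by the total number of pieces. The observed counts total 500, so 20% white gives 100 white.
    • Expected frequencies: 100 white, 150 blue, 50 green, 50 yellow, 100 black, 50 red.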
  • Hypotheses:
    • H0: Data follows the manufacturer's specifications.
    • H1: Data does not follow the manufacturer's specifications.
  • Degrees of Freedom: 6 categories - 1 = 5.
  • Test in Calculator:
    • Enter observed and expected values.
    • Use chi-squared goodness of fit test.
    • The test returns a chi-squared statistic of about 79.03 and a p-value of 1.34 x 10^-15.
  • Conclusion:
    • Since the p-value is far below 0.1 (10% significance level), reject H0.
    • Conclude that the box's color distribution does not follow the manufacturer's specifications.
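
The Lego example can be reproduced without a calculator. The sketch below uses only the Python standard library and the observed counts and percentages given above. The variable names are illustrative, and the p-value uses a closed-form expression that is valid only for 5 degrees of freedom.

```python
import math

colors = ["white", "blue", "green", "yellow", "black", "red"]
expected_percent = [20, 30, 10, 10, 20, 10]
observed = [82, 91, 40, 90, 120, 77]

total = sum(observed)  # 500 pieces
expected = [pct * total / 100 for pct in expected_percent]

# Per-category contribution (O - E)^2 / E, then the chi-squared statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)
df = len(observed) - 1  # 6 categories - 1 = 5

# Upper-tail probability for df = 5 only:
# erfc(sqrt(x/2)) + exp(-x/2) * sqrt(2x/pi) * (1 + x/3)
half = statistic / 2
p_value = math.erfc(math.sqrt(half)) + (
    math.exp(-half) * math.sqrt(2 * statistic / math.pi) * (1 + statistic / 3)
)

alpha = 0.10  # 10% significance level
for color, o, e, c in zip(colors, observed, expected, contributions):
    print(f"{color:>6}: observed={o:3d} expected={e:5.1f} contribution={c:6.2f}")
print(f"chi-squared = {statistic:.2f}, df = {df}, p-value = {p_value:.3g}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Under these assumptions the script should give a statistic of about 79.03 and a p-value on the order of 10^-15, in line with the calculator result. The per-category contributions also show where the mismatch comes from: yellow and blue contribute the most to the statistic.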

Key Takeaways

  • The chi-squared goodness of fit test assesses whether observed category counts match an expected distribution.
  • Degrees of freedom are the number of categories minus 1; H0 is rejected when the p-value is below the significance level.
  • The same procedure applies to real-world checks, such as verifying a manufacturer's stated proportions.