Correlation vs. Causation

Sep 13, 2025

Overview

This lecture explains the critical difference between correlation and causation, using ice cream-related examples to highlight common misunderstandings and their potential consequences.

Dangers of Misunderstanding Correlation and Causation

  • Ice cream sales rise and fall together with obesity, crime, drowning deaths, and forest fires, so many people conclude that selling more ice cream leads to them.
  • These observed relationships are not proof that ice cream causes these events.

Understanding Correlation

  • Correlation means two variables tend to move together; by itself it does not show that one causes the other.
  • Often, a third factor influences both variables (e.g., hot weather increases both ice cream sales and swimming).
  • Large datasets can reveal many coincidental correlations with no logical link.
  • Example: margarine sales and divorce rates in Maine are correlated, yet nothing plausibly links one to the other.
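
The third-factor point can be made concrete with a small simulation. This is only a sketch on made-up numbers: the temperature range, the coefficients 5.0 and 3.0, and the noise levels are all assumptions chosen for illustration, not figures from the lecture. Ice cream sales and swimming each depend on temperature and never on each other, yet they come out strongly correlated until the temperature effect is removed.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical daily temperatures (the third factor), in degrees Celsius.
temperature = [rng.uniform(10, 35) for _ in range(2000)]

# Both variables depend on temperature plus independent noise.
# Neither one appears in the formula for the other.
ice_cream_sales = [5.0 * t + rng.gauss(0, 20) for t in temperature]
swimmers = [3.0 * t + rng.gauss(0, 15) for t in temperature]

r_raw = pearson(ice_cream_sales, swimmers)

# Subtract the temperature effect (known here only because we wrote the
# simulation) and correlate what is left over.
sales_left = [s - 5.0 * t for s, t in zip(ice_cream_sales, temperature)]
swim_left = [w - 3.0 * t for w, t in zip(swimmers, temperature)]
r_adjusted = pearson(sales_left, swim_left)

print(f"raw correlation:            {r_raw:.2f}")
print(f"after removing temperature: {r_adjusted:.2f}")
```

With real data the temperature effect is not known in advance and would have to be estimated, so this shortcut works only inside a simulation.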

The Role of Causation

  • Causation is when one variable directly causes a change in another.
  • To claim causation, a strong, clear cause-and-effect relationship must be demonstrated.
  • The pharmaceutical industry uses clinical trials with control groups to test for causation before drugs are approved.
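
Why a control group helps can be sketched in a few lines. The example below is a toy model, not a description of how any real trial is run: the effect size `TRUE_EFFECT`, the health scores, and the sample size are hypothetical. Because a coin flip decides who gets the drug, the two groups differ on average only in the treatment, so the gap between their mean outcomes estimates the causal effect.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the run is repeatable

TRUE_EFFECT = 5.0  # hypothetical improvement caused by the drug

treated, control = [], []
for _ in range(4000):
    baseline_health = rng.gauss(50, 10)  # differs per person
    gets_drug = rng.random() < 0.5       # random assignment, unrelated to baseline
    noise = rng.gauss(0, 5)
    outcome = baseline_health + (TRUE_EFFECT if gets_drug else 0.0) + noise
    (treated if gets_drug else control).append(outcome)

# Difference in group means is the estimate of the drug's effect.
estimated_effect = mean(treated) - mean(control)
print(f"estimated effect: {estimated_effect:.2f} (true value in this toy model: {TRUE_EFFECT})")
```

If healthier people were more likely to receive the drug, the same subtraction would mix the drug's effect with the baseline difference, which is the problem random assignment is meant to avoid.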

Challenges in Proving Causation

  • Ice cream and obesity: data show that people gain weight in winter, when ice cream sales are low, which undercuts the idea that ice cream causes obesity.
  • Scientific studies investigate specific ingredients like fructose to understand their effects, but results can be complicated (fructose is found in both ice cream and fruit).

Data Dredging

  • Data dredging is the practice of searching massive datasets for any pattern, which often turns up misleading or coincidental correlations.
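
A quick way to see the effect is to dredge through pure noise. In this sketch every series is random and independent by construction; the counts `N_POINTS` and `N_SERIES` are arbitrary assumptions. Searching enough unrelated series for the best match to a target still tends to turn up a correlation that looks impressive.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(42)  # fixed seed so the run is repeatable

N_POINTS = 20   # e.g. 20 yearly observations per series
N_SERIES = 200  # number of unrelated candidate variables to search

# The target and every candidate are independent random noise.
target = [rng.gauss(0, 1) for _ in range(N_POINTS)]
candidates = [[rng.gauss(0, 1) for _ in range(N_POINTS)] for _ in range(N_SERIES)]

correlations = [pearson(target, series) for series in candidates]

# Keep only the most extreme result, as a dredging search would.
strongest = max(correlations, key=abs)
print(f"strongest correlation among {N_SERIES} random series: {strongest:.2f}")
```

Reporting only the strongest match, without saying how many series were tried, is what makes a dredged result misleading.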

Key Takeaways

  • Correlation does not equal causation.
  • Finding a correlation is easy; proving causation requires rigorous testing and evidence.
  • Be skeptical of simplistic claims that "X causes Y" without evidence of causation.

Key Terms & Definitions

  • Correlation — A relationship where two variables move together but one does not necessarily cause the other.
  • Causation — A relationship where one variable directly causes a change in another variable.
  • Data Dredging — Searching large datasets for any statistical correlations, often leading to misleading results.

Action Items / Next Steps

  • Be critical of claims linking two events; assess whether there is actual evidence of causation.
  • Review assigned readings on correlation and causation for deeper understanding.