
Understanding Conditional Independence Concepts

Sep 16, 2024

Lecture Notes on Conditional Independence

Independent Variables

  • Example: Sister and Brother's Ethnicity
    • Probability that Sister is White given Brother is White: 90%
    • Probability that Sister is White given Brother is Black: 10%
    • Conclusion: Sister (S) and Brother (B) are not independent.
      • If they were independent, the conditional probabilities would be equal.

Conditional Independence

  • Knowing Parents' Ethnicity:
    • If the parents' ethnicity is known (e.g., both parents are White), the probability of Sister being White remains high (95%) regardless of Brother's ethnicity.
    • Conclusion: S is conditionally independent of B given the parents.
      • Notation: S ⊥ B | Parents (read: S is independent of B given Parents).

COVID Example: Fever and PCR Test Result

  • Fever and the PCR test result are informative about each other: the probability of fever depends on the test result.
    • Probability of Fever given Negative PCR Test: 20%
    • Probability of Fever given Positive PCR Test: 70%
    • Conclusion: Fever (F) and PCR Test Result (P) are not independent.

Introducing Infection Status

  • If we know a person is infected with COVID (C = 1):
    • Probability of having Fever given C = 1: 60%
    • Probability of Positive Test Result given C = 1: 90%
    • Given C = 1, knowing about Fever does not change the probability of a positive test: p(P = 1 | F, C = 1) = p(P = 1 | C = 1) = 90%.
    • Conclusion: F and P are conditionally independent given C.
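
The COVID example can be checked numerically. The sketch below builds a joint distribution over C, F, and P under the assumption that F and P are conditionally independent given C. Only p(F = 1 | C = 1) = 0.6 and p(P = 1 | C = 1) = 0.9 come from the notes; the prior p(C = 1) and the probabilities for uninfected people are made-up illustration values, so the marginal numbers it prints will not match the 20% and 70% quoted above.

```python
from itertools import product

# From the notes: p(F=1 | C=1) = 0.6, p(P=1 | C=1) = 0.9.
# Assumed for illustration only: p(C=1), p(F=1 | C=0), p(P=1 | C=0).
p_c = {1: 0.3, 0: 0.7}
p_f_given_c = {1: 0.6, 0: 0.1}   # p(F=1 | C=c)
p_p_given_c = {1: 0.9, 0: 0.05}  # p(P=1 | C=c)

def bern(p_one, value):
    """Probability that a binary variable with p(1) = p_one takes `value`."""
    return p_one if value == 1 else 1.0 - p_one

# Joint built as p(C) * p(F | C) * p(P | C),
# i.e. assuming F is independent of P given C.
joint = {
    (c, f, p): p_c[c] * bern(p_f_given_c[c], f) * bern(p_p_given_c[c], p)
    for c, f, p in product((0, 1), repeat=3)
}

def prob(**fixed):
    """Sum the joint over all entries consistent with the fixed values."""
    return sum(
        pr for (c, f, p), pr in joint.items()
        if all({"c": c, "f": f, "p": p}[k] == v for k, v in fixed.items())
    )

# Marginally, the test result changes the probability of fever.
fever_given_pos = prob(f=1, p=1) / prob(p=1)
fever_given_neg = prob(f=1, p=0) / prob(p=0)

# Given C = 1, the test result no longer matters.
fever_given_pos_c1 = prob(f=1, p=1, c=1) / prob(p=1, c=1)
fever_given_neg_c1 = prob(f=1, p=0, c=1) / prob(p=0, c=1)

print(fever_given_pos, fever_given_neg)        # clearly different
print(fever_given_pos_c1, fever_given_neg_c1)  # both approximately 0.6
```

The first pair differs because a positive test makes infection more likely, and infection makes fever more likely. Once C is fixed, that indirect path is closed and the two conditional probabilities agree.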

Definitions and Notations

  • Conditional independence: X is conditionally independent of Y given Z if:
    [ p(X | Y, Z) = p(X | Z) ]
  • Independence in general:
    • If Z is empty, X and Y are (marginally) independent if:
      [ p(X | Y) = p(X) ]
  • The set of all independencies is denoted by I(P).

Proposition on Independence

  • X and Y are independent given Z if: [ p(X, Y | Z) = p(X | Z) \times p(Y | Z) ]
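
To see why this product form agrees with the definition p(X | Y, Z) = p(X | Z), apply the chain rule to the conditional joint. A short derivation, assuming p(Y, Z) > 0 so that every conditional is defined:

```latex
\begin{align*}
p(X, Y \mid Z) &= p(X \mid Y, Z)\, p(Y \mid Z) && \text{(chain rule)} \\
               &= p(X \mid Z)\, p(Y \mid Z)    && \text{(using } p(X \mid Y, Z) = p(X \mid Z)\text{)}
\end{align*}
```

Conversely, dividing the product form by p(Y | Z) recovers the definition, so the two statements are equivalent wherever p(Y | Z) > 0.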

Implications of Conditional Independence

  • Returning to the Fever and PCR Test example:
    • The joint probability distribution of F, P, and C can be simplified using the conditional independence of F and P given C.

Factorization of Joint Distribution

  1. Factorize the joint distribution into components.
  2. Use chain rule for random variables to express the joint probability.
    • [ p(X_1, X_2, \ldots, X_n) = p(X_1) \times p(X_2 | X_1) \times \cdots \times p(X_n | X_{n-1}, \ldots, X_1) ]
  3. Identify conditional independencies to simplify terms.
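
Applied to the fever/PCR example, the three steps work out as follows. This is a worked sketch; the ordering C, F, P is a choice made here for illustration.

```latex
\begin{align*}
p(C, F, P) &= p(C)\, p(F \mid C)\, p(P \mid F, C) && \text{(chain rule, ordering } C, F, P\text{)} \\
           &= p(C)\, p(F \mid C)\, p(P \mid C)    && \text{(} F \text{ and } P \text{ conditionally independent given } C\text{)}
\end{align*}
```

If all three variables are binary, the unrestricted joint needs 2^3 - 1 = 7 free parameters, while the simplified form needs 1 + 2 + 2 = 5.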

Analyzing Conditional Independence

  • Example of Joint Distribution involving multiple variables:
    • Conditional independence can reduce the number of parameters needed to describe a joint distribution.
  • The chain rule holds for any ordering of the variables, so the ordering can be chosen to expose the conditional independencies that simplify the most terms.
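
The parameter saving can be made concrete by counting free parameters. The sketch below assumes every variable is binary. The helper names and the larger "one cause, ten effects" structure are hypothetical, chosen only to illustrate the count.

```python
def full_joint_params(n_vars: int) -> int:
    """Free parameters of an unrestricted joint over n binary variables."""
    return 2 ** n_vars - 1

def factorized_params(parents: dict[str, list[str]]) -> int:
    """Free parameters when each binary variable depends only on its parents.

    Each variable needs one parameter per joint setting of its parents,
    i.e. 2 ** len(parents).
    """
    return sum(2 ** len(ps) for ps in parents.values())

# Fever/PCR example: C has no parents, F and P each depend only on C.
covid = {"C": [], "F": ["C"], "P": ["C"]}
print(full_joint_params(3), factorized_params(covid))  # 7 5

# Hypothetical larger case: one cause with 10 conditionally independent effects.
big = {"C": [], **{f"E{i}": ["C"] for i in range(10)}}
print(full_joint_params(11), factorized_params(big))   # 2047 21
```

The unrestricted count grows exponentially with the number of variables, while the factorized count here grows linearly.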

Learning Conditional Independence

  • Finding conditional independencies from data:
    • Statistical tests for identifying I(P).
  • Bayesian networks provide a graphical representation of these dependencies and independencies.
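
One simple way to look for conditional independencies in data is to estimate the conditional mutual information I(X; Y | Z), which is zero exactly when X and Y are conditionally independent given Z. The sketch below is not a formal statistical test, since it sets no significance threshold. It draws synthetic samples from the fever/PCR structure, using made-up probabilities wherever the notes give none, and compares plug-in estimates of I(F; P) and I(F; P | C).

```python
import math
import random
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in nats from a list of (x, y) samples."""
    n = len(pairs)
    n_xy = Counter(pairs)
    n_x = Counter(x for x, _ in pairs)
    n_y = Counter(y for _, y in pairs)
    return sum(
        (c / n) * math.log(c * n / (n_x[x] * n_y[y]))
        for (x, y), c in n_xy.items()
    )

def conditional_mutual_information(triples):
    """Plug-in estimate of I(X; Y | Z) from (x, y, z) samples.

    Computed as the average of I(X; Y | Z = z), weighted by the frequency of z.
    """
    n = len(triples)
    by_z = {}
    for x, y, z in triples:
        by_z.setdefault(z, []).append((x, y))
    return sum(len(g) / n * mutual_information(g) for g in by_z.values())

# Synthetic data: F and P each depend only on C (assumed probabilities).
rng = random.Random(0)
samples = []
for _ in range(20_000):
    c = int(rng.random() < 0.3)
    f = int(rng.random() < (0.6 if c else 0.1))
    p = int(rng.random() < (0.9 if c else 0.05))
    samples.append((f, p, c))

mi = mutual_information([(f, p) for f, p, _ in samples])
cmi = conditional_mutual_information(samples)
print(f"I(F; P) = {mi:.4f}, I(F; P | C) = {cmi:.4f}")
```

The marginal estimate should come out clearly positive and the conditional one close to zero. On real data, a proper test (for example a chi-square test within each value of Z) would be needed to decide whether a small estimate is compatible with independence.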

Summary & Goals

  • The main goal is to estimate the joint probability distribution from data, which requires finding the correct factorization.
  • Different factorizations require different numbers of parameters.

Next Steps

  • Explore how to systematically factorize joint distributions given identified conditional independencies.