Causality: Understanding the underlying data generating process is crucial, especially for answering causal questions rather than just predictive ones.
Correlation vs. Causation: The familiar warning that "correlation does not imply causation" applies here; an association in the data may reflect confounding rather than a causal effect.
Predictive vs. Causal Questions: Predictive models may not answer causal questions relevant in fields like healthcare.
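The gap between a predictive association and a causal effect can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not an example from the lecture: the confounder z, the variable names, and the data generating process are all assumptions made for illustration. The treatment has no effect on the outcome, yet the two are correlated in observational data because both depend on z.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical confounder that drives both treatment and outcome.
z = rng.normal(size=n)

# Observational setting: treatment is more likely when z is large.
t_obs = (z + rng.normal(size=n) > 0).astype(float)
# The outcome depends on z only; the treatment has no causal effect.
y_obs = z + rng.normal(size=n)
corr_observational = np.corrcoef(t_obs, y_obs)[0, 1]

# Randomized setting: treatment is assigned by a coin flip, independent of z.
t_rand = rng.binomial(1, 0.5, size=n).astype(float)
y_rand = z + rng.normal(size=n)
corr_randomized = np.corrcoef(t_rand, y_rand)[0, 1]

print(f"observational correlation: {corr_observational:.3f}")
print(f"randomized correlation:    {corr_randomized:.3f}")
```

Under these assumptions the observational correlation is clearly positive while the randomized one is close to zero, so a predictive model trained on the observational data would assign weight to a treatment that does nothing.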
Importance of Causality in Healthcare
Predictive models help in early detection, e.g., diabetes, but the ultimate aim is to prevent health issues.
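Example: In a model predicting diabetes, gastric bypass surgery can appear with a negative weight, which raises a causal question: does the surgery actually prevent diabetes, or does the weight reflect something else about who receives it?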
Diagnosis and treatment decisions are inherently causal rather than purely predictive.
Causal Inference
Potential Outcomes Framework: Focuses on counterfactuals (what would happen under different scenarios).
Causal Graphs: Visual representations showing causal relationships; important for causal inference.
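The potential outcomes framework can be written compactly. The notation below is the conventional one and is assumed here, not taken from the notes: T_i is the binary treatment for unit i, Y_i(1) and Y_i(0) are the outcomes that unit would have with and without treatment, and Y_i is the outcome actually observed.

```latex
\begin{align*}
\text{ITE}_i &= Y_i(1) - Y_i(0)
  && \text{(individual treatment effect)} \\
\text{ATE} &= \mathbb{E}\left[ Y(1) - Y(0) \right]
  && \text{(average treatment effect)} \\
Y_i &= T_i \, Y_i(1) + (1 - T_i) \, Y_i(0)
  && \text{(observed outcome)}
\end{align*}
```

Only one of the two potential outcomes is ever observed for a given unit; the other is the counterfactual, which is why causal effects must be estimated under assumptions and cannot be read off the data directly.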
Examples of Causal Questions
Does gastric bypass prevent diabetes?
How do treatment decisions influence patient outcomes?
Does smoking cause lung cancer? (Can't be tested ethically with randomized trials)
Methods and Assumptions
Propensity Scores: The probability of receiving treatment given the covariates; estimating it lets us check for overlap between treated and untreated groups.
No Unobserved Confounding: Assumes all relevant factors affecting treatment and outcome are observed.
Common Support (Overlap): Assumes every subpopulation has a nonzero probability of receiving each treatment.
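The propensity score and the overlap check can be sketched as follows. This is a minimal illustration on synthetic data, assuming a single binary covariate so that the propensity score is just the treated fraction in each stratum; the function names estimate_propensity and has_overlap, the threshold eps, and the simulated probabilities are hypothetical. With many or continuous covariates one would typically fit a model such as logistic regression instead.

```python
import numpy as np

def estimate_propensity(x, t):
    """Estimate P(T=1 | X=x) for a discrete covariate as the
    fraction of treated units within each stratum of x."""
    return {int(v): float(t[x == v].mean()) for v in np.unique(x)}

def has_overlap(propensity, eps=0.05):
    """Return True if every stratum's propensity lies in [eps, 1 - eps],
    i.e. both treatment arms are represented in every stratum."""
    return all(eps <= p <= 1 - eps for p in propensity.values())

rng = np.random.default_rng(0)
n = 10_000
x = rng.integers(0, 2, size=n)            # hypothetical binary covariate
p_treat = np.where(x == 1, 0.7, 0.2)      # assumed true treatment probabilities
t = rng.binomial(1, p_treat)              # observed treatment assignment

propensity = estimate_propensity(x, t)
overlap_ok = has_overlap(propensity)

print(propensity)
print("overlap satisfied:", overlap_ok)
```

If some stratum had a propensity of exactly 0 or 1, there would be no comparable units in the other arm for that stratum, and the causal effect there could not be estimated from the data without extrapolation.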
Adjustment Formula
Adjustment Formula: Used to estimate causal effects from observational data.
Requires assumptions of no unobserved confounding and common support.
Covariate Adjustment: Learning a function to predict outcomes based on both covariates and treatment.
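Covariate adjustment can be sketched on synthetic data. This is an illustration under stated assumptions, not the lecture's implementation: the data generating process, the true effect of 2.0, and the function names design, fit_outcome_model, and adjusted_ate are all hypothetical. The sketch fits an outcome model f(x, t) by least squares, then averages f(x, 1) - f(x, 0) over the observed covariates, which is how the adjustment formula is applied in practice. The simulation has no unobserved confounding and satisfies overlap by construction.

```python
import numpy as np

def design(x, t):
    """Design matrix with intercept, treatment, covariate, and interaction."""
    return np.column_stack([np.ones_like(x, dtype=float), t, x, t * x])

def fit_outcome_model(x, t, y):
    """Least-squares fit of the outcome on covariate and treatment."""
    coef, *_ = np.linalg.lstsq(design(x, t), y, rcond=None)
    return coef

def adjusted_ate(x, coef):
    """Average predicted outcome under treatment minus under no treatment,
    taken over the observed covariate distribution."""
    ones = np.ones_like(x, dtype=float)
    zeros = np.zeros_like(x, dtype=float)
    y1 = design(x, ones) @ coef
    y0 = design(x, zeros) @ coef
    return float(np.mean(y1 - y0))

rng = np.random.default_rng(0)
n = 20_000
x = rng.integers(0, 2, size=n)                      # observed confounder
t = rng.binomial(1, np.where(x == 1, 0.8, 0.2))     # treatment depends on x
y = 2.0 * t + 3.0 * x + rng.normal(0.0, 1.0, n)     # assumed true effect: 2.0

naive = float(y[t == 1].mean() - y[t == 0].mean())  # ignores confounding
ate = adjusted_ate(x, fit_outcome_model(x, t, y))

print(f"naive difference in means: {naive:.2f}")
print(f"covariate-adjusted ATE:    {ate:.2f}")
```

In this simulation the naive difference in means overstates the effect because treated units are more likely to have x = 1, while the adjusted estimate recovers a value close to the assumed true effect. The estimate is only as good as the fitted outcome model and the two assumptions behind it.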
Challenges and Considerations
High dimensional data complicates traditional statistical approaches.
Reducing causal inference to a machine learning problem requires distinguishing the roles of inputs (covariates), interventions (treatments), and outcomes.
Potential biases need to be addressed, especially in high-dimensional settings.
Q&A and Discussion
Concerns about how well machine learning models learn the outcome function, since errors in the learned function carry over into the causal estimate.
Importance of having valid assumptions and functional forms for accurate causal inferences.
Conclusion
Emphasis on understanding causal relationships in data for making informed decisions, particularly in healthcare.
Recognizing the limitations of predictive models and the importance of incorporating causal inference methods.