Inference with Generalized Linear Models (GLMs)

Preliminaries

Response variable (y_i): Observed response for observation i; μ_hat_i denotes the estimator of its expected value.
Two Extreme Positions:
- Null Model: Each response gets the same estimate (global average, denoted as Y̅). This model is overly simple.
- Saturated/Full Model: Each response is estimated by its own observed value (μ_hat_i = y_i). The fit is perfect, which amounts to overfitting.

Likelihood Ratio Test Statistic: Measures how far the fitted regression model is from the full model.
Log Likelihood: Written in terms of the parameter θ_i and mean μ_i for observation i; substituting θ_i as a function of μ_i yields the log likelihood in terms of μ_i.
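
For reference, the substitution can be sketched in standard exponential-dispersion notation. The functions b and c and the lowercase φ (written Φ in these notes) are the usual textbook symbols and are assumptions here, since the notes do not spell them out.

```latex
\ell(\theta; y) = \sum_{i=1}^{n} \left[ \frac{y_i \theta_i - b(\theta_i)}{\phi} + c(y_i, \phi) \right],
\qquad \mu_i = b'(\theta_i)
\;\Longrightarrow\;
\ell(\mu; y) = \sum_{i=1}^{n} \left[ \frac{y_i\, \theta(\mu_i) - b\big(\theta(\mu_i)\big)}{\phi} + c(y_i, \phi) \right],
\qquad \theta(\mu_i) = (b')^{-1}(\mu_i).
```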

Log Likelihood: The likelihood L is evaluated at μ_hat and compared with the likelihood of the full model.
Likelihood Ratio Test Statistic: Compares log likelihood of regression model against saturated model. Formula: -2 * log(L(μ_hat) / L(saturated model)).
Scaled Deviance Statistic: Deviance divided by dispersion parameter (Φ).

Dispersion Parameter Fixed: When Φ is fixed at 1 (as in the Poisson and binomial families), the deviance and scaled deviance are the same.
General Setting: When Φ is unknown, it must be estimated, and the scaled deviance depends on that estimate.
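
As an illustration of the fixed-dispersion case, here is a minimal sketch of the Poisson deviance, evaluated for the two extreme models from the preliminaries. The counts are made up, and the function name `poisson_deviance` is hypothetical. With Φ = 1 the value returned is both the deviance and the scaled deviance.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum[y*log(y/mu) - (y - mu)].

    The term y*log(y/mu) is taken to be 0 when y == 0.
    """
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

y = [2, 0, 5, 3, 1, 7]            # made-up counts
y_bar = sum(y) / len(y)           # global average used by the null model

# Null model: every observation gets the same estimate y_bar.
null_deviance = poisson_deviance(y, [y_bar] * len(y))

# Saturated model: every observation is estimated by itself.
saturated_deviance = poisson_deviance(y, y)

print("null deviance:     ", null_deviance)
print("saturated deviance:", saturated_deviance)
```

Any regression model nested between these two extremes has a deviance between the two printed values.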

Log Likelihood (normal case): Function of mean μ_i and variance parameter (σ²).
Mean (μ_i): Expressed as x_iᵀβ under the identity link function; the dispersion parameter is σ².
Scaled Deviance Statistic: Sum of squared differences between response (y_i) and estimate (μ_hat_i), divided by σ².
Weighted Sum of Squares: With observation-specific weights w_i, each squared difference is multiplied by w_i.
Residual Sum of Squares: The scaled deviance is the familiar residual sum of squares divided by σ².
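
The identity between the likelihood ratio form and the residual sum of squares can be checked numerically. The sketch below uses made-up data, a hand-computed least-squares line, and an assumed known σ²; all names are hypothetical.

```python
import math

def normal_loglik(y, mu, sigma2):
    """Log likelihood of independent N(mu_i, sigma2) observations."""
    n = len(y)
    rss = sum((yi - mi) ** 2 for yi, mi in zip(y, mu))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)

# Made-up data; sigma2 is assumed known for the illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1]
sigma2 = 0.5

# Least-squares fit of y on x (identity link: mu_i = intercept + slope * x_i).
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
mu_hat = [intercept + slope * xi for xi in x]

# Scaled deviance as residual sum of squares divided by sigma2.
rss = sum((yi - mi) ** 2 for yi, mi in zip(y, mu_hat))
scaled_deviance_rss = rss / sigma2

# Scaled deviance as -2 * log likelihood ratio against the saturated model (mu = y).
scaled_deviance_lrt = -2 * (normal_loglik(y, mu_hat, sigma2)
                            - normal_loglik(y, y, sigma2))

print(scaled_deviance_rss, scaled_deviance_lrt)
```

The two printed numbers agree up to floating-point rounding, because the normalizing constants cancel in the likelihood ratio.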

Definition (Pearson Statistic): Sum over observations of the squared difference between response (y_i) and fitted mean (μ_hat_i), divided by the variance function evaluated at μ_hat_i.
Example (Normal Linear Regression): The variance is constant (σ²) and the variance function equals 1, so the Pearson statistic reduces to the sum of squared residuals.
Distributional Results: Both the deviance and the Pearson statistic are asymptotically distributed as Φ * χ² with n - (p + 1) degrees of freedom, where p + 1 counts the estimated regression parameters (p covariates plus an intercept).
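
A minimal sketch of the Pearson statistic with a pluggable variance function follows. The fitted means are hypothetical numbers rather than the output of an actual fit, and the function name is an assumption.

```python
def pearson_statistic(y, mu_hat, variance_fn):
    """Sum of (y_i - mu_hat_i)^2 / V(mu_hat_i) over all observations."""
    return sum((yi - mi) ** 2 / variance_fn(mi) for yi, mi in zip(y, mu_hat))

y = [2, 0, 5, 3, 1, 7]                    # made-up responses
mu_hat = [1.5, 1.0, 4.0, 3.0, 2.0, 6.5]   # hypothetical fitted means

# Poisson: variance function V(mu) = mu.
x2_poisson = pearson_statistic(y, mu_hat, lambda mu: mu)

# Normal: variance function V(mu) = 1, so the statistic is the residual sum of squares.
x2_normal = pearson_statistic(y, mu_hat, lambda mu: 1.0)
rss = sum((yi - mi) ** 2 for yi, mi in zip(y, mu_hat))

print("Pearson (Poisson):", x2_poisson)
print("Pearson (normal): ", x2_normal, "RSS:", rss)
```

Passing the variance function as an argument keeps the same code usable for other families, for example `lambda mu: mu ** 2` for the gamma family.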

Comparison: Null hypothesis model vs alternative hypothesis model.
Drop in Deviance Statistic: Difference between scaled deviance from reduced model (null) and larger model (alternative).
Likelihood Ratio Test: Compare log likelihoods under null and alternative hypotheses.
Decision Making: Compare the observed drop in deviance against a χ² quantile to decide whether to reject the null hypothesis.
Dispersion Parameter Known: Compare the drop in scaled deviance directly with the χ² distribution with q degrees of freedom, where q is the number of restrictions imposed by the null hypothesis.
Dispersion Parameter Unknown: Estimate Φ by the deviance of the alternative model divided by its degrees of freedom.
F Test Statistic: Ratio of the drop in deviance divided by q to the estimator of Φ. Compare the observed F value with an F distribution quantile.
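
The two test statistics can be sketched as follows. The deviances, the number of restrictions q, and the residual degrees of freedom are hypothetical values. The p-value helper uses a closed form that holds only for a χ² distribution with 1 degree of freedom; for other degrees of freedom, or for the F quantile, a table or statistical software would be needed.

```python
import math

def drop_in_deviance(dev_null, dev_alt, phi=1.0):
    """Drop in scaled deviance between nested models (null inside alternative)."""
    return (dev_null - dev_alt) / phi

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def f_statistic(dev_null, dev_alt, q, df_resid_alt):
    """F statistic: (drop in deviance / q) divided by the dispersion estimate."""
    phi_hat = dev_alt / df_resid_alt      # dispersion estimated from the alternative model
    return ((dev_null - dev_alt) / q) / phi_hat

# Hypothetical inputs, not taken from a real fit.
dev_null, dev_alt = 58.4, 51.2
q = 1                 # one parameter set to zero under the null
df_resid_alt = 46     # residual degrees of freedom of the alternative model

# Dispersion known (phi = 1): compare with a chi-square quantile.
drop = drop_in_deviance(dev_null, dev_alt)
p_value = chi2_sf_1df(drop)
reject_known_phi = drop > 3.841           # 95% quantile of chi-square with 1 df

# Dispersion unknown: compute F and compare with an F(q, df_resid_alt) quantile.
f_obs = f_statistic(dev_null, dev_alt, q, df_resid_alt)

print("drop:", drop, "p-value:", p_value, "reject:", reject_known_phi)
print("F statistic:", f_obs)
```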

Assessment: Evaluate the model fit, the variance assumption, the choice of link function, the choice of covariates, etc.
Pearson Residuals: Difference between response (y_i) and fitted value (μ_hat_i), divided by the square root of the variance function evaluated at the fitted value.
Deviance Residuals: Signed square roots of each observation's contribution to the deviance, so their sum of squares equals the deviance of the model under investigation.
Software: Pearson and deviance residuals can be extracted in statistical software after fitting a GLM.
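
Both residual types can also be computed by hand. The sketch below does so for a Poisson model, using hypothetical fitted means in place of the output of a fitted GLM; the function names are assumptions.

```python
import math

def pearson_residuals_poisson(y, mu_hat):
    """(y_i - mu_hat_i) / sqrt(V(mu_hat_i)) with Poisson variance V(mu) = mu."""
    return [(yi - mi) / math.sqrt(mi) for yi, mi in zip(y, mu_hat)]

def deviance_residuals_poisson(y, mu_hat):
    """sign(y_i - mu_hat_i) * sqrt(d_i), where d_i is observation i's deviance contribution."""
    residuals = []
    for yi, mi in zip(y, mu_hat):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        d_i = max(2.0 * (log_term - (yi - mi)), 0.0)   # guard against tiny negative rounding
        residuals.append(math.copysign(math.sqrt(d_i), yi - mi))
    return residuals

y = [2, 0, 5, 3, 1, 7]                    # made-up counts
mu_hat = [1.5, 1.0, 4.0, 3.0, 2.0, 6.5]   # hypothetical fitted means

r_pearson = pearson_residuals_poisson(y, mu_hat)
r_deviance = deviance_residuals_poisson(y, mu_hat)

# Sums of squares recover the Pearson statistic and the deviance respectively.
pearson_stat = sum(r ** 2 for r in r_pearson)
deviance = sum(r ** 2 for r in r_deviance)

print("Pearson residuals: ", r_pearson)
print("deviance residuals:", r_deviance)
print("Pearson statistic:", pearson_stat, "deviance:", deviance)
```

Plotting either set of residuals against the fitted values or against a covariate is a common way to carry out the assessment described above.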