Generalized Linear Models Lecture Notes
Jul 22, 2024
Lecture on Generalized Linear Models (GLMs)
Overview
Generalized Linear Models (GLMs) generalize linear models in two primary ways:
Allowing for different distributions for the response variable (exponential family distributions).
Fitting a linear predictor on a transformed scale suitable for the data distribution.
Exponential Family
Exponential family: A set of probability distributions that can be defined over various domains.
The lecture focuses on the case where the response variable (Y) and the parameter (θ) are real-valued.
Canonical case: The density is the exponential of a term involving the product of θ and Y, plus other terms that depend on θ alone or on Y alone.
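For reference, the canonical form is commonly written as below. The notation is an assumption, since the notes do not spell it out: φ is a known dispersion parameter, b(θ) is the function discussed in the next section, and c(y, φ) collects the terms that do not involve θ.

```latex
f_\theta(y) = \exp\!\left( \frac{y\,\theta - b(\theta)}{\phi} + c(y, \phi) \right)
```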
Properties of the Canonical Exponential Family
Contains distributions such as the Gaussian, Poisson, and Bernoulli.
Normalization factor: Ensures the density integrates (or, for discrete Y, sums) to 1.
Function b(θ): Characterizes the distribution, encoding information such as the mean and variance.
Derivatives: The expectation of the derivative of the log-likelihood with respect to θ is zero.
Variances: The second derivative of the log-likelihood provides variance information; together with the first identity, it ties the variance of Y to b''(θ).
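A quick numerical check of these properties, as a sketch rather than a proof. It assumes the standard results E[Y] = b'(θ) and Var(Y) = b''(θ) when the dispersion is 1, and uses the Bernoulli family, whose b(θ) = log(1 + e^θ). The helper names and the value of θ are illustrative choices, not from the lecture.

```python
import math

def b(theta):
    """Bernoulli cumulant function b(theta) = log(1 + e^theta) (assumed standard form)."""
    return math.log1p(math.exp(theta))

def derivative(f, x, h=1e-4):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

theta = 0.4                       # arbitrary canonical parameter for the check
p = 1 / (1 + math.exp(-theta))    # Bernoulli success probability implied by theta

mean_from_b = derivative(b, theta)         # should match E[Y] = p
var_from_b = second_derivative(b, theta)   # should match Var(Y) = p * (1 - p)

print(mean_from_b, p)
print(var_from_b, p * (1 - p))
```

The two printed pairs should agree to several decimal places; the small gap is finite-difference error.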
Likelihood Functions
Log-likelihood for one observation: The log of the density function.
Log-likelihood for n independent observations: The sum of the individual log-likelihoods, which follows from the (conditional) independence of the observations.
The variance properties are proved by taking expectations, that is, by integrating against the density.
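The argument can be sketched in a few lines. This assumes the usual canonical form with a known dispersion φ and the two standard likelihood identities (zero expected score, and expected second derivative plus expected squared score equal to zero). The symbols φ and c are assumed notation.

```latex
\begin{align*}
\ell(\theta) &= \frac{Y\theta - b(\theta)}{\phi} + c(Y, \phi), \\
0 = \mathbb{E}\!\left[\frac{\partial \ell}{\partial \theta}\right]
  &= \frac{\mathbb{E}[Y] - b'(\theta)}{\phi}
  \quad\Longrightarrow\quad \mathbb{E}[Y] = b'(\theta), \\
0 = \mathbb{E}\!\left[\frac{\partial^2 \ell}{\partial \theta^2}\right]
  + \mathbb{E}\!\left[\left(\frac{\partial \ell}{\partial \theta}\right)^{\!2}\right]
  &= -\frac{b''(\theta)}{\phi} + \frac{\operatorname{Var}(Y)}{\phi^2}
  \quad\Longrightarrow\quad \operatorname{Var}(Y) = \phi\, b''(\theta).
\end{align*}
```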
Conditional Distributions and Regression
GLMs focus on the conditional distribution of Y given X, using a systematic component (the linear predictor).
Link function (g): Connects the expected value (μ) of the distribution to the linear predictor (Xβ), so that g(μ) = Xβ.
Inverse link function (g⁻¹): Maps the linear predictor back to the expected value, μ = g⁻¹(Xβ).
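Written for a single observation, the relationship reads as follows. Treating x as one covariate vector (one row of X) is a notational assumption made here.

```latex
g\big(\mu(x)\big) = x^\top \beta,
\qquad
\mu(x) = \mathbb{E}[\,Y \mid X = x\,] = g^{-1}\!\big(x^\top \beta\big)
```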
Common Link Functions
Identity link: Used in linear regression; the mean is modeled directly, with no transformation.
Log link: Common for Poisson distributions (μ > 0).
Logit link (log(μ / (1-μ))): Maps probabilities in (0,1) to the real line, suitable for the Bernoulli distribution.
Probit link: Uses the inverse of the standard normal CDF.
Complementary log-log (log(-log(1-μ))): Another option for binary outcomes.
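The links listed above can be sketched as pairs of (g, g⁻¹) functions. This is a minimal illustration using only the Python standard library; the dictionary name LINKS and its keys are hypothetical choices made for the example.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1

# Each entry maps a link name to (g, g_inverse):
# g takes the mean mu to the linear-predictor scale, g_inverse maps back.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (math.log, math.exp),
    "logit": (lambda mu: math.log(mu / (1 - mu)),
              lambda eta: 1 / (1 + math.exp(-eta))),
    "probit": (_STD_NORMAL.inv_cdf, _STD_NORMAL.cdf),
    "cloglog": (lambda mu: math.log(-math.log(1 - mu)),
                lambda eta: 1 - math.exp(-math.exp(eta))),
}

mu = 0.3  # a mean that is valid for every link above (positive and inside (0, 1))
for name, (g, g_inv) in LINKS.items():
    eta = g(mu)
    print(f"{name:9s} eta = {eta: .4f}  round trip = {g_inv(eta):.4f}")
```

Every round trip should return 0.3, which is a simple way to confirm that each g⁻¹ really inverts its g.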
Canonical Links
Canonical link: The link for which the linear predictor equals the canonical parameter θ of the exponential family.
It provides a natural choice of link function when modeling specific distributions.
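In symbols, and assuming the standard identity μ = b'(θ) for the canonical exponential family, the canonical link is the inverse of b':

```latex
\mu = b'(\theta)
\quad\text{and}\quad
g = (b')^{-1}
\quad\Longrightarrow\quad
g(\mu) = \theta = x^\top \beta
```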
Examples
Bernoulli distribution: The PMF can be expressed in canonical form; the canonical link is the logit function.
Poisson distribution: The PMF can be expressed in canonical form; the canonical link is the log.
Gamma distribution and others have their own canonical links, derived from their exponential family representations.
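As a worked instance of the Bernoulli case, with p denoting the success probability (notation assumed here) and dispersion equal to 1:

```latex
\begin{align*}
P(Y = y) &= p^{y} (1 - p)^{1 - y}
          = \exp\!\left( y \log\frac{p}{1 - p} + \log(1 - p) \right),
          \qquad y \in \{0, 1\}, \\
\theta &= \log\frac{p}{1 - p},
\qquad
b(\theta) = -\log(1 - p) = \log\!\left(1 + e^{\theta}\right).
\end{align*}
```

Since θ is the logit of the mean p, setting θ equal to the linear predictor gives the logit link.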
Generalized Linear Models (GLMs) Summary
GLMs extend linear models to accommodate different distributions and link functions suited to the properties of the data.
Using canonical links simplifies the modeling process by leveraging properties of the exponential family.
The log-likelihood and its derivatives provide key insights into distribution characteristics and aid in parameter estimation.
Detailed Example: Poisson Distribution
PMF: P(Y=y) = (e^(-µ) * µ^y) / y!, where θ = log(µ) in canonical form.
Transformation: The PMF can be restated in canonical form, which simplifies expectation and variance calculations.
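The restatement can be written out explicitly. The canonical parameter is θ = log μ, and the mean and variance follow from the derivatives of b, assuming the standard identities E[Y] = b'(θ) and Var(Y) = b''(θ) with dispersion 1:

```latex
\begin{align*}
P(Y = y) &= \frac{e^{-\mu} \mu^{y}}{y!}
          = \exp\!\big( y \log\mu - \mu - \log y! \big), \\
\theta &= \log\mu,
\qquad b(\theta) = e^{\theta},
\qquad c(y) = -\log y!, \\
\mathbb{E}[Y] &= b'(\theta) = e^{\theta} = \mu,
\qquad
\operatorname{Var}(Y) = b''(\theta) = e^{\theta} = \mu.
\end{align*}
```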
Beta as the Key Parameter
β parameter: The chain runs from β to μ to θ, which organizes the process of modeling and interpreting relationships.
When maximizing the likelihood, focus on the log-likelihood as a function of β.
This gives simplified expressions and avoids complex integrals.
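To make the idea of the log-likelihood as a function of β concrete, here is a minimal sketch for Poisson regression with the canonical (log) link, where θ_i equals the linear predictor x_i·β. The function names and the tiny dataset are made up for illustration.

```python
import math

def poisson_loglik(beta, xs, ys):
    """Poisson log-likelihood as a function of beta, with log link (theta_i = x_i . beta)."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = sum(bj * xj for bj, xj in zip(beta, x))  # linear predictor
        # y * theta - b(theta) + c(y), with b(theta) = e^theta and c(y) = -log(y!)
        total += y * eta - math.exp(eta) - math.lgamma(y + 1)
    return total

def poisson_score(beta, xs, ys):
    """Gradient of the log-likelihood with respect to beta: sum_i (y_i - mu_i) * x_i."""
    grad = [0.0] * len(beta)
    for x, y in zip(xs, ys):
        mu = math.exp(sum(bj * xj for bj, xj in zip(beta, x)))
        for j, xj in enumerate(x):
            grad[j] += (y - mu) * xj
    return grad

# Illustrative data: each row is (intercept, covariate).
xs = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
ys = [0, 1, 2]

ll_at_zero = poisson_loglik([0.0, 0.0], xs, ys)
score_at_zero = poisson_score([0.0, 0.0], xs, ys)
print(ll_at_zero, score_at_zero)
```

No integrals appear anywhere: once the model is in canonical form, both the objective and its gradient are plain sums over observations.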
Practical Implementation
Iterative approaches such as the Newton-Raphson method and iteratively reweighted least squares are used for parameter estimation.
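A minimal sketch of Newton-Raphson for Poisson regression with a log link and a single covariate, written in plain Python so the 2×2 linear system can be solved by hand. With the canonical link, each Newton step coincides with a Fisher scoring step, which is the update that iteratively reweighted least squares computes. The function name, starting point, tolerance, and data are illustrative assumptions, not taken from the lecture.

```python
import math

def fit_poisson_newton(x, y, tol=1e-10, max_iter=50):
    """Fit log(mu_i) = b0 + b1 * x_i by Newton-Raphson on the Poisson log-likelihood."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: X^T (y - mu)
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix: X^T W X with weights W = diag(mu)
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        # Solve the 2x2 system (information) * step = score
        det = i00 * i11 - i01 * i01
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

# Illustrative counts that grow roughly exponentially in x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1, 2, 4, 7, 12]

b0, b1 = fit_poisson_newton(x, y)
mu_hat = [math.exp(b0 + b1 * xi) for xi in x]
print(b0, b1)
```

At the solution the score is zero, so the fitted means sum to the observed total; this is a convenient sanity check on any implementation with an intercept.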
Upcoming Topics
Review the concepts and practice using convex optimization principles for parameter estimation.
Advanced discussions of newer (post-1975) topics in statistics, plus a general Q&A.