The generalized linear model (GLM) is a flexible generalization of ordinary least squares regression. OLS restricts the regression coefficients to have a constant effect on the dependent variable. GLM allows for this effect to vary along the range of the explanatory variables. In particular, a nonlinear function links the linear parameterization to the expected value of the random variable.
Let $\mu = E(Y)$ and $\eta = X\beta$. The basic structure of a GLM is the link function $g(\mu) = \eta$. Therefore, $Y = g^{-1}(X\beta) + \varepsilon$.
A GLM is essentially a nonlinear model in which the linear parameterization enters through the expected value of Y. To estimate the model, one needs three components:
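Distribution | Range of $Y$ | $f(Y)$ | $E(Y)$ | $Var(Y)$ |
---|---|---|---|---|
Bernoulli($\pi$) | 0, 1 | $\pi^{Y}(1-\pi)^{1-Y}$ | $\pi$ | $\pi(1-\pi)$ |
Poisson($\lambda$) | 0, 1, 2, ... | $e^{-\lambda}\lambda^{Y}/Y!$ | $\lambda$ | $\lambda$ |
Normal($\mu,\sigma$) | $(-\infty,\infty)$ | $\frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left[-\frac{(Y-\mu)^{2}}{2\sigma^{2}}\right]$ | $\mu$ | $\sigma^{2}$ |
Gamma($\lambda,\rho$) | $[0,\infty)$ | $\frac{\lambda^{\rho}}{\Gamma(\rho)}e^{-\lambda Y}Y^{\rho-1}$ | $\rho/\lambda$ | $\rho/\lambda^{2}$ |
Exponential($\lambda$) | $[0,\infty)$ | $\lambda e^{-\lambda Y}$ | $1/\lambda$ | $1/\lambda^{2}$ |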
Inverse Normal | ... | |||
Inverse Gamma | ... | |||
... |
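As a quick numerical check on the mean and variance columns, the sketch below sums the Bernoulli and Poisson probability functions over their supports. The parameter values (`lam = 3.5`, `p = 0.3`), the helper name `moments`, and the truncation of the Poisson support at 200 terms are illustrative assumptions, not part of the original notes.

```python
import math

def poisson_pmf(y, lam):
    # exp(-lam) * lam**y / y!, computed on the log scale for stability
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def bernoulli_pmf(y, p):
    # p**y * (1 - p)**(1 - y) for y in {0, 1}
    return p ** y * (1 - p) ** (1 - y)

def moments(pmf, support):
    """Return (mean, variance) of a discrete distribution by direct summation."""
    mean = sum(y * pmf(y) for y in support)
    var = sum((y - mean) ** 2 * pmf(y) for y in support)
    return mean, var

lam, p = 3.5, 0.3  # hypothetical parameter values

# The Poisson support is infinite; 200 terms is an assumed truncation that
# leaves a negligible tail for lam = 3.5.
pois_mean, pois_var = moments(lambda y: poisson_pmf(y, lam), range(200))
bern_mean, bern_var = moments(lambda y: bernoulli_pmf(y, p), (0, 1))

print(pois_mean, pois_var)  # both should be close to lam
print(bern_mean, bern_var)  # should be close to p and p * (1 - p)
```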
The table below lists commonly used link functions and their inverses:
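Link | $\eta = g(\mu)$ | $\mu = g^{-1}(\eta)$ |
---|---|---|
Identity | $\mu$ | $\eta$ |
Log | $\ln(\mu)$ | $\exp(\eta)$ |
Inverse | $\mu^{-1}$ | $\eta^{-1}$ |
Inverse-Square | $\mu^{-2}$ | $\eta^{-0.5}$ |
Square Root | $\mu^{0.5}$ | $\eta^{2}$ |
Logit | $\ln[\mu/(1-\mu)]$ | $\Lambda(\eta) = \exp(\eta)/[1+\exp(\eta)]$ |
Probit | $\Phi^{-1}(\mu)$ | $\Phi(\eta)$ |
Log-log | $-\ln[-\ln(\mu)]$ | $\exp[-\exp(-\eta)]$ |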
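The link pairs in this table can be written down directly and checked for consistency. The following is a minimal sketch using only the Python standard library; the dictionary `LINKS` and the helper `round_trip_error` are hypothetical names introduced here to confirm that each listed inverse undoes its link.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution for the probit link

# Each entry maps a link name to the pair (g, g_inverse) from the table.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (math.log, math.exp),
    "inverse": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "inverse_square": (lambda mu: mu ** -2, lambda eta: eta ** -0.5),
    "sqrt": (math.sqrt, lambda eta: eta ** 2),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
    "probit": (_STD_NORMAL.inv_cdf, _STD_NORMAL.cdf),
    "loglog": (lambda mu: -math.log(-math.log(mu)),
               lambda eta: math.exp(-math.exp(-eta))),
}

def round_trip_error(name, mu):
    """Return |g_inverse(g(mu)) - mu| for the named link."""
    g, g_inv = LINKS[name]
    return abs(g_inv(g(mu)) - mu)

# mu = 0.3 lies in the domain of every link listed above.
for name in LINKS:
    print(name, round_trip_error(name, 0.3))
```

The value 0.3 is chosen only because it is valid for all eight links; the logit, probit, and log-log links require $0 < \mu < 1$, while the log and power links require $\mu > 0$.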
The coefficients of a GLM are estimated by the maximum likelihood method.
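To make the estimation step concrete, here is a minimal sketch of maximum likelihood for one particular case, the Bernoulli family with a logit link, using Newton-Raphson iterations. The choice of Newton-Raphson, the function names, and the six-observation toy data set are assumptions made for illustration; the notes do not prescribe a specific optimizer.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def logit_loglik(beta, X, y):
    """Bernoulli log-likelihood under the logit link."""
    ll = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        # ln(Lambda(eta)) = -ln(1 + e^-eta), ln(1 - Lambda(eta)) = -ln(1 + e^eta)
        ll -= yi * math.log1p(math.exp(-eta)) + (1 - yi) * math.log1p(math.exp(eta))
    return ll

def fit_logit(X, y, tol=1e-10, max_iter=50):
    """Maximize the logit log-likelihood by Newton-Raphson, starting from zero."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        grad = [0.0] * k                      # score vector
        info = [[0.0] * k for _ in range(k)]  # negative Hessian
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    info[a][c] += p * (1.0 - p) * xi[a] * xi[c]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Hypothetical toy data: a constant and one regressor, not perfectly separable.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0]]
y = [0, 0, 1, 0, 1, 1]
beta_hat = fit_logit(X, y)
print(beta_hat, logit_loglik(beta_hat, X, y))
```

At the maximum the score vector is zero, which is a convenient way to confirm that the iterations converged.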
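Model interpretation is typically based on the marginal effect $\partial E(Y)/\partial X$. Differentiating the link relation $g(\mu) = \eta$, that is $g(E(Y)) = X\beta$, with respect to $X$ gives $\partial g(E(Y))/\partial X = g'(\mu)\,\partial E(Y)/\partial X = \beta$, where $g'(\mu) = \partial g(\mu)/\partial\mu$. Therefore $\partial E(Y)/\partial X = \beta/g'(\mu)$. For the identity link, $g' = 1$ and $\partial E(Y)/\partial X = \beta$, the constant effect of the linear model. For the log link, $g'(\mu) = 1/\mu$, so the marginal effect is $\beta\mu$ and varies with $X$.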
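The formula for the marginal effect, the coefficient divided by the derivative of the link, can be evaluated for a few links as follows. This is a sketch under stated assumptions: the dictionaries `LINK_DERIVATIVE` and `INVERSE_LINK`, the function `marginal_effect`, and the numerical values in the usage lines are hypothetical and cover only four of the links.

```python
import math
from statistics import NormalDist

_N = NormalDist()  # standard normal, for the probit link

# Derivative g'(mu) of each link with respect to mu.
LINK_DERIVATIVE = {
    "identity": lambda mu: 1.0,
    "log": lambda mu: 1.0 / mu,
    "logit": lambda mu: 1.0 / (mu * (1.0 - mu)),
    "probit": lambda mu: 1.0 / _N.pdf(_N.inv_cdf(mu)),
}

# Inverse link mu = g^{-1}(eta), used to evaluate g' at the fitted mean.
INVERSE_LINK = {
    "identity": lambda eta: eta,
    "log": math.exp,
    "logit": lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    "probit": _N.cdf,
}

def marginal_effect(link, beta_j, eta):
    """Return dE(Y)/dX_j = beta_j / g'(mu), with mu = g^{-1}(eta)."""
    mu = INVERSE_LINK[link](eta)
    return beta_j / LINK_DERIVATIVE[link](mu)

# Hypothetical coefficient and linear predictor value.
beta_j, eta = 0.7, 0.4
for link in LINK_DERIVATIVE:
    print(link, marginal_effect(link, beta_j, eta))
```

For the logit link this reduces to the coefficient times $\Lambda(\eta)[1-\Lambda(\eta)]$, and for the probit link to the coefficient times the standard normal density at $\eta$, so the effect depends on where it is evaluated.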
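Family | Link | Log-Likelihood Function: llf($\theta$) | $\theta$ | Notes |
---|---|---|---|---|
Normal($\mu,\sigma$) | Identity: $\mu = X\beta$ | $-\frac{N}{2}\ln(2\pi\sigma^{2}) - \frac{1}{2}\sum_{i=1}^{N}(Y_i - X_i\beta)^{2}/\sigma^{2}$ | $(\beta,\sigma)$ | This is a linear model |
Normal($\mu,\sigma$) | Log: $\ln(\mu) = X\beta$ | $-\frac{N}{2}\ln(2\pi\sigma^{2}) - \frac{1}{2}\sum_{i=1}^{N}(Y_i - \exp(X_i\beta))^{2}/\sigma^{2}$ | $(\beta,\sigma)$ | Not a log-linear model |
Gamma($\lambda,\rho$) | Identity: $\rho/\lambda = X\beta$ | $N[\rho\ln(\rho) - \ln\Gamma(\rho)] + \sum_{i=1}^{N}\left[(\rho-1)\ln(Y_i) - \rho\ln(X_i\beta) - \rho Y_i/(X_i\beta)\right]$ | $(\beta,\rho)$ | |
Exponential($\lambda$) | Identity: $1/\lambda = X\beta$ | $\sum_{i=1}^{N}\left[-\ln(X_i\beta) - Y_i/(X_i\beta)\right]$ | $\beta$ | |
Exponential($\lambda$) | Inverse: $1/\lambda = 1/(X\beta)$ | $\sum_{i=1}^{N}\left[\ln(X_i\beta) - Y_i X_i\beta\right]$ | $\beta$ | |
Poisson($\lambda$) | Identity: $\lambda = X\beta$ | $\sum_{i=1}^{N}\left[-X_i\beta + Y_i\ln(X_i\beta) - \ln(Y_i!)\right]$ | $\beta$ | |
Bernoulli($\pi$) | Logit: $\ln(\pi/(1-\pi)) = X\beta$ | $\sum_{i=1}^{N}\left[Y_i\ln(\Lambda(X_i\beta)) + (1-Y_i)\ln(1-\Lambda(X_i\beta))\right]$ | $\beta$ | Logit Model |
Bernoulli($\pi$) | Probit: $\Phi^{-1}(\pi) = X\beta$ | $\sum_{i=1}^{N}\left[Y_i\ln(\Phi(X_i\beta)) + (1-Y_i)\ln(1-\Phi(X_i\beta))\right]$ | $\beta$ | Probit Model |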
... | ||||
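Each log-likelihood is obtained by substituting the mean implied by the link into the log of the corresponding density and summing over observations. The sketch below does this for four family and link combinations; the function names and the convention that `X` is a list of rows are assumptions made for illustration. The identity-link versions require every fitted mean to be positive for the Poisson and exponential families, which the code does not enforce.

```python
import math

def dot(x, beta):
    """Linear predictor X_i * beta for one observation."""
    return sum(a * b for a, b in zip(x, beta))

def llf_normal_identity(beta, sigma, X, y):
    # Sum of ln f(Y_i) for Normal(mu_i, sigma) with mu_i = X_i * beta
    n = len(y)
    sse = sum((yi - dot(xi, beta)) ** 2 for xi, yi in zip(X, y))
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - 0.5 * sse / sigma ** 2

def llf_poisson_identity(beta, X, y):
    # ln f(Y_i) = -lambda_i + Y_i ln(lambda_i) - ln(Y_i!), lambda_i = X_i * beta
    total = 0.0
    for xi, yi in zip(X, y):
        lam = dot(xi, beta)
        total += -lam + yi * math.log(lam) - math.lgamma(yi + 1)
    return total

def llf_exponential_identity(beta, X, y):
    # ln f(Y_i) = ln(lambda_i) - lambda_i Y_i, with 1 / lambda_i = X_i * beta
    total = 0.0
    for xi, yi in zip(X, y):
        mean = dot(xi, beta)
        total += -math.log(mean) - yi / mean
    return total

def llf_bernoulli_logit(beta, X, y):
    # ln f(Y_i) = Y_i ln(p_i) + (1 - Y_i) ln(1 - p_i), p_i = Lambda(X_i * beta)
    total = 0.0
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-dot(xi, beta)))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total
```

Any of these functions can be passed, with its sign flipped, to a general-purpose minimizer to obtain the maximum likelihood estimates.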
$\text{GRADE} = \beta_0 + \beta_1\,\text{GPA} + \beta_2\,\text{TUCE} + \beta_3\,\text{PSI} + \varepsilon$
The following variables are available in the data file GRADE.TXT:
Use the maximum likelihood method to formulate and estimate the generalized linear model with a Bernoulli (binomial) distribution under the logit and probit links, respectively. Explain the estimated marginal effects of the new teaching method on students' grade performance.