Topic 4b

Heteroscedastic Regression Models

Readings and References:

W. H. Greene, Econometric Analysis, 7th Ed., Chapter 9: The Generalized Regression Model and Heteroscedasticity, Prentice-Hall, 2011.
A. C. Harvey, "Estimating Regression Models with Multiplicative Heteroscedasticity," Econometrica, 1976, 461-465.

Consider a general regression model F(Y,X,β) = ε ~ normal(0,Σ). Let the covariance matrix be Σ = σ²Ω(α). The corresponding log-likelihood function is

ll(β,α,σ²|Y,X) = -½N ln(2π) -½N ln(σ²) -½ ln|Ω| -½(1/σ²) ε'Ω^-1ε + ∑_i=1,2,...,N ln|∂ε_i/∂Y_i|

The last term is the sum of log-Jacobians from ε_i to Y_i over the entire sample. Since the variance term can be solved as σ² = ε'Ω^-1ε / N for a given Ω, the concentrated log-likelihood function is

ll^*(β,α|Y,X) = -½N (1+ln(2π)-ln(N)) -½ ln|Ω| -½N ln(ε'Ω^-1ε) + ∑_i=1,2,...,N ln|∂ε_i/∂Y_i|

Maximum likelihood estimation is therefore in general not equivalent to nonlinear least squares unless Ω = I, the identity matrix, and ∂ε_i/∂Y_i = 1 for each i=1,2,...,N. If Ω is known and the log-Jacobians vanish, the problem reduces to GLS (Generalized Least Squares), which minimizes ε'Ω^-1ε.
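The concentration step can be spelled out as a short worked derivation. It assumes the usual multivariate normal form, in which σ² enters the log-likelihood (written ℓ here) only through the terms -(N/2) ln σ² and -ε'Ω^-1ε/(2σ²); the symbol ℓ and the hat notation are introduced only for this sketch.

```latex
% First-order condition for \sigma^2, holding \beta and \Omega fixed
\frac{\partial \ell}{\partial \sigma^2}
  = -\frac{N}{2\sigma^2} + \frac{\varepsilon'\Omega^{-1}\varepsilon}{2\sigma^4} = 0
  \quad\Longrightarrow\quad
  \hat{\sigma}^2 = \frac{\varepsilon'\Omega^{-1}\varepsilon}{N}

% Substituting \hat{\sigma}^2 back into the two terms that involve \sigma^2
-\frac{N}{2}\ln\hat{\sigma}^2 - \frac{\varepsilon'\Omega^{-1}\varepsilon}{2\hat{\sigma}^2}
  = -\frac{N}{2}\bigl(1 - \ln N\bigr) - \frac{N}{2}\ln\bigl(\varepsilon'\Omega^{-1}\varepsilon\bigr)
```

Adding the remaining terms that do not involve σ² gives the concentrated form, with the constant -½N (1+ln(2π)-ln(N)) collecting everything free of β and α.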

Unfortunately, Ω = Ω(α) is not known in practice and must be parameterized by a lower-dimensional parameter vector α, which in turn is estimated together with the vector of model parameters β. Models with heteroscedastic and/or autocorrelated errors are special cases of the general regression model in which Ω(α) is defined more specifically.

For simplicity, consider a regression model ε = F(Y,X,β) = Y - f(X,β). Then for each observation i, ∂ε_i/∂Y_i = 1, so the log-Jacobian terms vanish. We assume further that the heteroscedastic error ε_i ~ normal(0,σ_i²). The log-likelihood function for observation i is

ll(β,σ_i²|Y_i,X_i) = -½ [ln(2π) + ln(σ_i²) + ε_i²/σ_i²]

Summing over a sample of N observations, the total log-likelihood function is written as

ll(β,σ₁²,σ₂²,...,σ_N²|Y,X) = -½N ln(2π) -½ ∑_i=1,2,...,N ln(σ_i²) -½ ∑_i=1,2,...,N (ε_i²/σ_i²)
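As an illustration of the total log-likelihood above, the following minimal sketch evaluates it for given residuals and variances. The function name `hetero_loglik` and the numbers are made up for illustration; only the Python standard library is used.

```python
import math


def hetero_loglik(eps, sigma2):
    """Total normal log-likelihood for residuals eps_i with variances sigma2_i."""
    if len(eps) != len(sigma2):
        raise ValueError("eps and sigma2 must have the same length")
    total = 0.0
    for e, s2 in zip(eps, sigma2):
        # Per-observation term: -1/2 [ln(2*pi) + ln(sigma_i^2) + eps_i^2 / sigma_i^2]
        total += -0.5 * (math.log(2.0 * math.pi) + math.log(s2) + e * e / s2)
    return total


# Illustrative (made-up) residuals and observation-specific variances
eps = [0.5, -1.0, 2.0]
sigma2 = [1.0, 2.0, 4.0]
ll = hetero_loglik(eps, sigma2)
```

With N free variances and only N observations this function cannot be maximized meaningfully over all σ_i², which is why a restricted variance function is needed.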

Under this general form of heteroscedasticity there are too many unknown parameters: N variances in addition to β. For practical purposes, a hypothesis on the form of heteroscedasticity must be imposed:

σ_i² = σ² h_i(α)

where σ² > 0 and h_i(α) is indexed by i to indicate that it is a function of Z_i. That is, h_i(α) = h(α|Z_i), where Z is a set of independent variables that may or may not coincide with X. Writing h_i for h_i(α) for brevity, the log-likelihood function is

ll(β,α,σ²|Y,X) = -½N (ln(2π) + ln(σ²)) -½ ∑_i=1,2,...,N ln(h_i) -½(1/σ²) ∑_i=1,2,...,N (ε_i²/h_i)

Let ε_i^* = ε_i / √h_i and replace σ² with its maximum likelihood estimator ε^*'ε^*/N. The concentrated log-likelihood function is then

ll^*(β,α|Y,X) = -½N (1+ln(2π)-ln(N)) -½ ∑_i=1,2,...,N ln(h_i) -½N ln(ε^*'ε^*)

The last two log-terms can be combined as:

ll^*(β,α|Y,X) = -½N (1+ln(2π)-ln(N)) -½N ln(ε^**'ε^**)

where ε^** = ε^*√h, and h = (h₁h₂...h_N)^(1/N) is the geometric mean of the h_i. The estimation thus becomes a weighted nonlinear least squares problem with the weighted errors defined by ε_i^** = ε_i√(h/h_i).
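The equivalence of the two concentrated forms can be checked numerically. The sketch below is illustrative only: the function names and the sample numbers are assumptions, and the code takes the residuals ε_i and the values h_i as given rather than computing them from a model.

```python
import math


def concentrated_loglik(eps, h):
    """ll*(beta, alpha) with sigma^2 replaced by its MLE eps*'eps*/N."""
    n = len(eps)
    ssr_star = sum(e * e / hi for e, hi in zip(eps, h))  # eps*'eps*
    const = -0.5 * n * (1.0 + math.log(2.0 * math.pi) - math.log(n))
    return const - 0.5 * sum(math.log(hi) for hi in h) - 0.5 * n * math.log(ssr_star)


def concentrated_loglik_wls(eps, h):
    """The same quantity written as a weighted least squares criterion."""
    n = len(eps)
    h_bar = math.exp(sum(math.log(hi) for hi in h) / n)  # geometric mean of h_i
    ssr_2star = sum(e * e * h_bar / hi for e, hi in zip(eps, h))  # eps**'eps**
    const = -0.5 * n * (1.0 + math.log(2.0 * math.pi) - math.log(n))
    return const - 0.5 * n * math.log(ssr_2star)


def full_loglik(eps, h, sigma2):
    """Unconcentrated log-likelihood with sigma_i^2 = sigma2 * h_i."""
    n = len(eps)
    return (-0.5 * n * (math.log(2.0 * math.pi) + math.log(sigma2))
            - 0.5 * sum(math.log(hi) for hi in h)
            - 0.5 / sigma2 * sum(e * e / hi for e, hi in zip(eps, h)))


# Made-up residuals and variance weights, for illustration only
eps = [0.5, -1.0, 2.0, 0.3]
h = [1.0, 2.0, 4.0, 0.5]
sigma2_hat = sum(e * e / hi for e, hi in zip(eps, h)) / len(eps)
```

Both concentrated functions return the same value, and that value equals `full_loglik` evaluated at `sigma2_hat`; any other σ² gives a lower log-likelihood.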

Consider the following special cases of h_i = h(α|Z_i) = h(Z_iα):

(i) σ_i² = σ²(Z_iα), Z_iα > 0
(ii) σ_i² = σ²(Z_iα)²
(iii) Exponential Heteroscedasticity: σ_i² = σ²exp(Z_iα)
    The corresponding concentrated log-likelihood function for estimation is
    ll^*(β,α|Y,X) = -½N (1+ln(2π)-ln(N)) -½ ∑_i=1,2,...,N Z_iα -½N ln(ε^*'ε^*)
    where ε_i^* = ε_i / exp(Z_iα)^½ for each observation i=1,2,...,N.
    Equivalently,
    ll^*(β,α|Y,X) = -½N (1+ln(2π)-ln(N)) -½N ln(ε^**'ε^**)
    where ε_i^** = ε_i√(h/h_i), h = (h₁h₂...h_N)^(1/N), and h_i = exp(Z_iα) for each observation i=1,2,...,N.
(iv) Multiplicative Heteroscedasticity: σ_i² = σ² ∏_m=1,2,...,M (Z_im)^(α_m), where M is the number of variables in Z_i. This is equivalent to the exponential case if the variables in Z are in logs. That is, σ_i² = σ²exp[ln(Z_i)α]. A special case, with a single variable, is
    σ_i² = σ² Z_i^α
    If α = 0, the model is homoscedastic; if α = 2, it reduces to case (ii).
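The four variance functions listed above can be written down directly. This is a minimal sketch under the assumption that Z_i and α are given as equal-length lists of numbers; all function names are hypothetical.

```python
import math


def zdot(z, alpha):
    """Inner product Z_i alpha."""
    return sum(zj * aj for zj, aj in zip(z, alpha))


def h_linear(z, alpha):
    """h_i = Z_i alpha, valid only when Z_i alpha > 0."""
    v = zdot(z, alpha)
    if v <= 0.0:
        raise ValueError("Z_i alpha must be positive")
    return v


def h_squared(z, alpha):
    """h_i = (Z_i alpha)^2."""
    return zdot(z, alpha) ** 2


def h_exponential(z, alpha):
    """h_i = exp(Z_i alpha)."""
    return math.exp(zdot(z, alpha))


def h_multiplicative(z, alpha):
    """h_i = product over m of Z_im ** alpha_m, assuming every Z_im > 0."""
    return math.prod(zj ** aj for zj, aj in zip(z, alpha))


# Made-up values: the multiplicative form equals the exponential form in logs
z = [2.0, 3.0]
alpha = [0.5, 1.5]
mult = h_multiplicative(z, alpha)
expo = h_exponential([math.log(zj) for zj in z], alpha)
```

Here `mult` and `expo` agree up to rounding, and with a single variable `h_multiplicative([z], [0.0])` gives 1, the homoscedastic case.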

Example

Given the data of per capita expenditure on public schools and per capita income from Greene's Table 12.1 (1997, p. 541) or GREENE.TXT, consider the following somewhat heteroscedastic relationship of public school spending (Y) and income (X):

Y = β₀ + β₁ X + β₂ X² + ε

Find and compare the maximum likelihood estimates based on the following hypotheses of heteroscedasticity:

(1) σ_i² = σ² X_i²
(2) σ_i² = σ² X_i^α
(3) σ_i² = σ² exp(αX_i)

Note that (1) is a special case of (2) in which α = 2; and (2) is equivalent to (3) if X is expressed in log form.
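One way to carry out the comparison is to profile the concentrated log-likelihood over α: for a fixed α the model is linear in β, so β follows from weighted least squares with weights 1/h_i. The sketch below is not the course's own program. It uses simulated data in place of the Greene data set, a simple grid search in place of a numerical optimizer, and hypothetical function names throughout.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def wls(x, y, h):
    """Weighted least squares for Y = b0 + b1 X + b2 X^2 with weights 1/h_i."""
    rows = [[1.0, xi, xi * xi] for xi in x]
    xtx = [[sum(r[j] * r[k] / hi for r, hi in zip(rows, h)) for k in range(3)]
           for j in range(3)]
    xty = [sum(r[j] * yi / hi for r, yi, hi in zip(rows, y, h)) for j in range(3)]
    beta = solve(xtx, xty)
    eps = [yi - sum(bj * rj for bj, rj in zip(beta, r)) for r, yi in zip(rows, y)]
    return beta, eps


def conc_loglik(eps, h):
    """Concentrated log-likelihood ll*(beta, alpha)."""
    n = len(eps)
    ssr = sum(e * e / hi for e, hi in zip(eps, h))
    return (-0.5 * n * (1.0 + math.log(2.0 * math.pi) - math.log(n))
            - 0.5 * sum(math.log(hi) for hi in h) - 0.5 * n * math.log(ssr))


def fit(x, y, h_of_alpha, grid):
    """Return (log-likelihood, alpha, beta) maximizing ll* over the alpha grid."""
    best = None
    for a in grid:
        h = [h_of_alpha(xi, a) for xi in x]
        beta, eps = wls(x, y, h)
        ll = conc_loglik(eps, h)
        if best is None or ll > best[0]:
            best = (ll, a, beta)
    return best


# Simulated data (NOT the Greene data): X in (0.5, 1.5), error variance
# proportional to X^3; all parameter values here are arbitrary.
rng = random.Random(0)
x = [0.5 + rng.random() for _ in range(50)]
y = [1.0 - 2.0 * xi + 3.0 * xi * xi + rng.gauss(0.0, 0.2) * xi ** 1.5 for xi in x]

grid = [k / 10.0 for k in range(-20, 61)]                      # alpha in [-2, 6]
ll1, a1, b1 = fit(x, y, lambda xi, a: xi ** a, [2.0])          # hypothesis (1)
ll2, a2, b2 = fit(x, y, lambda xi, a: xi ** a, grid)           # hypothesis (2)
ll3, a3, b3 = fit(x, y, lambda xi, a: math.exp(a * xi), grid)  # hypothesis (3)
```

Because the grid contains α = 2, the maximized log-likelihood under (2) can never fall below that under (1), which is the nesting noted above. Hypotheses (2) and (3) are not nested in each other, so their log-likelihood values can only be compared informally. A finer grid or a numerical optimizer could replace the 0.1 step used here.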