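$$ll(\beta,\alpha,\sigma^2 \mid Y,X) = -\tfrac{1}{2}N\left(\ln(2\pi)+\ln(\sigma^2)\right) - \tfrac{1}{2}\ln|\Omega| - \tfrac{1}{2}\,\frac{1}{\sigma^2}\,\varepsilon'\Omega^{-1}\varepsilon + \sum_{i=1}^{N}\ln\left|\frac{\partial\varepsilon_i}{\partial Y_i}\right|$$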
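The last term is the sum of the log-Jacobians of the transformation from $\varepsilon_i$ to $Y_i$ over the entire sample. Since, for a given $\Omega$, the first-order condition for the variance gives $\sigma^2 = \varepsilon'\Omega^{-1}\varepsilon / N$, the concentrated log-likelihood function is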
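$$ll^*(\beta,\alpha \mid Y,X) = -\tfrac{1}{2}N\left(1+\ln(2\pi)-\ln(N)\right) - \tfrac{1}{2}N\ln(\varepsilon'\Omega^{-1}\varepsilon) - \tfrac{1}{2}\ln|\Omega| + \sum_{i=1}^{N}\ln\left|\frac{\partial\varepsilon_i}{\partial Y_i}\right|$$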
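The concentration step can be spelled out as follows. This is a sketch of the standard argument, holding $\beta$ and $\Omega$ fixed and differentiating with respect to $\sigma^2$; the hat notation $\hat\sigma^2$ is introduced here only for illustration.

```latex
\begin{aligned}
\frac{\partial\, ll}{\partial \sigma^2}
  &= -\frac{N}{2\sigma^2} + \frac{\varepsilon'\Omega^{-1}\varepsilon}{2\sigma^4} = 0
  \quad\Longrightarrow\quad
  \hat\sigma^2 = \frac{\varepsilon'\Omega^{-1}\varepsilon}{N}, \\
-\tfrac{1}{2}N\ln(\hat\sigma^2)
  &= -\tfrac{1}{2}N\ln(\varepsilon'\Omega^{-1}\varepsilon) + \tfrac{1}{2}N\ln(N), \\
-\tfrac{1}{2}\,\frac{\varepsilon'\Omega^{-1}\varepsilon}{\hat\sigma^2}
  &= -\tfrac{1}{2}N .
\end{aligned}
```

Collecting the constants $-\tfrac{1}{2}N\ln(2\pi) + \tfrac{1}{2}N\ln(N) - \tfrac{1}{2}N$ gives the leading term $-\tfrac{1}{2}N(1+\ln(2\pi)-\ln(N))$; the determinant and Jacobian terms do not involve $\sigma^2$ and carry over unchanged.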
Maximum likelihood estimation is therefore in general not equivalent to nonlinear least squares unless $\Omega = I$, the identity matrix, and $\partial\varepsilon_i/\partial Y_i = 1$ for each $i=1,2,\ldots,N$. If $\Omega$ is known and the log-Jacobians vanish, it reduces to the GLS (Generalized Least Squares) problem of minimizing $\varepsilon'\Omega^{-1}\varepsilon$.
Unfortunately, $\Omega = \Omega(\alpha)$ is generally not known and must be parameterized by a lower-dimensional vector $\alpha$, which is estimated together with the vector of model parameters $\beta$. Models with heteroscedastic and/or autocorrelated errors are special cases of the general regression model in which $\Omega(\alpha)$ is specified more concretely.
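For simplicity, consider a regression model $\varepsilon = F(Y,X,\beta) = Y - f(X,\beta)$. Then for each observation $i$, $\partial\varepsilon_i/\partial Y_i = 1$. We assume further that the heteroscedastic errors are independently distributed as $\varepsilon_i \sim \mathrm{normal}(0,\sigma_i^2)$. The log-likelihood function for observation $i$ is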
$$ll(\beta,\sigma_i^2 \mid Y_i,X_i) = -\tfrac{1}{2}\left[\ln(2\pi) + \ln(\sigma_i^2) + \frac{\varepsilon_i^2}{\sigma_i^2}\right]$$
Summing over a sample of N observations, the total log-likelihood function is written as
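$$ll(\beta,\sigma_1^2,\sigma_2^2,\ldots,\sigma_N^2 \mid Y,X) = -\tfrac{1}{2}N\ln(2\pi) - \tfrac{1}{2}\sum_{i=1}^{N}\ln(\sigma_i^2) - \tfrac{1}{2}\sum_{i=1}^{N}\frac{\varepsilon_i^2}{\sigma_i^2}$$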
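The total log-likelihood above translates directly into a few lines of code. The following is a minimal sketch in plain Python, assuming the residuals $\varepsilon_i = Y_i - f(X_i,\beta)$ have already been computed; the function name `hetero_loglik` and the sample numbers are hypothetical and serve only as illustration.

```python
import math

def hetero_loglik(resid, sigma2):
    """Total log-likelihood for independent normal errors with variances sigma2[i]."""
    if len(resid) != len(sigma2):
        raise ValueError("resid and sigma2 must have the same length")
    n = len(resid)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * sum(math.log(s2) for s2 in sigma2)
            - 0.5 * sum(e * e / s2 for e, s2 in zip(resid, sigma2)))

# Hypothetical residuals and observation-specific variances, for illustration only.
resid = [0.5, -1.0, 2.0]
sigma2 = [1.0, 2.0, 4.0]
print(hetero_loglik(resid, sigma2))
```

Note that the function takes one variance per observation, which is exactly why the unrestricted model cannot be estimated without further structure.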
In this general form of heteroscedasticity there are too many unknown parameters: $N$ variances in addition to $\beta$, to be estimated from only $N$ observations. For practical purposes, some structure must be imposed on the heteroscedasticity:
$$\sigma_i^2 = \sigma^2 h_i(\alpha)$$
where $\sigma^2 > 0$ and $h_i(\alpha)$ is indexed by $i$ to indicate that it is a function of $Z_i$. That is, $h_i(\alpha) = h(\alpha \mid Z_i)$, where $Z$ is a set of independent variables that may or may not coincide with $X$. With $h_i(\alpha)$ denoted by $h_i$ for brevity, the log-likelihood function is written as
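$$ll(\beta,\alpha,\sigma^2 \mid Y,X) = -\tfrac{1}{2}N\left(\ln(2\pi) + \ln(\sigma^2)\right) - \tfrac{1}{2}\sum_{i=1}^{N}\ln(h_i) - \tfrac{1}{2}\,\frac{1}{\sigma^2}\sum_{i=1}^{N}\frac{\varepsilon_i^2}{h_i}$$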
Let $\varepsilon_i^* = \varepsilon_i/\sqrt{h_i}$ and substitute out $\sigma^2$ with its maximum likelihood estimator $\varepsilon^{*\prime}\varepsilon^*/N$. The concentrated log-likelihood function is then
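$$ll^*(\beta,\alpha \mid Y,X) = -\tfrac{1}{2}N\left(1+\ln(2\pi)-\ln(N)\right) - \tfrac{1}{2}\sum_{i=1}^{N}\ln(h_i) - \tfrac{1}{2}N\ln(\varepsilon^{*\prime}\varepsilon^*)$$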
The last two log-terms can be combined as:
$$ll^*(\beta,\alpha \mid Y,X) = -\tfrac{1}{2}N\left(1+\ln(2\pi)-\ln(N)\right) - \tfrac{1}{2}N\ln(\varepsilon^{**\prime}\varepsilon^{**})$$
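where $\varepsilon^{**} = \varepsilon^*\sqrt{h}$ and $h = (h_1 h_2 \cdots h_N)^{1/N}$ is the geometric mean of the $h_i$. Maximizing $ll^*$ thus becomes a weighted nonlinear least squares problem with the weighted errors defined by $\varepsilon_i^{**} = \varepsilon_i\sqrt{h/h_i}$.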
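The equivalence of the two forms of the concentrated log-likelihood can be checked numerically. The sketch below, in plain Python, evaluates both for given residuals and weights; the function names and the sample values are hypothetical and chosen only for illustration.

```python
import math

def concentrated_loglik(resid, h):
    """Concentrated log-likelihood using eps* = eps / sqrt(h_i)."""
    n = len(resid)
    ss_star = sum(e * e / hi for e, hi in zip(resid, h))
    const = -0.5 * n * (1.0 + math.log(2.0 * math.pi) - math.log(n))
    return const - 0.5 * sum(math.log(hi) for hi in h) - 0.5 * n * math.log(ss_star)

def weighted_residuals(resid, h):
    """eps** = eps * sqrt(h_bar / h_i), with h_bar the geometric mean of h."""
    n = len(h)
    h_bar = math.exp(sum(math.log(hi) for hi in h) / n)
    return [e * math.sqrt(h_bar / hi) for e, hi in zip(resid, h)]

def concentrated_loglik_wls(resid, h):
    """The same quantity written as a pure sum-of-squares criterion in eps**."""
    n = len(resid)
    ss = sum(e * e for e in weighted_residuals(resid, h))
    return -0.5 * n * (1.0 + math.log(2.0 * math.pi) - math.log(n)) - 0.5 * n * math.log(ss)

# Hypothetical residuals and weights h_i > 0, for illustration only.
resid = [0.5, -1.0, 2.0, -0.3]
h = [1.0, 2.0, 4.0, 0.5]
print(concentrated_loglik(resid, h), concentrated_loglik_wls(resid, h))
```

The two printed values agree up to floating-point rounding, which is the practical point of the rewrite: any least squares routine that minimizes the sum of squared weighted errors also maximizes the concentrated log-likelihood.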
Consider the following special cases of $h_i = h(\alpha \mid Z_i) = h(Z_i\alpha)$:
For $h_i = \exp(Z_i\alpha)$, the corresponding concentrated log-likelihood function for estimation is
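$$ll^*(\beta,\alpha \mid Y,X) = -\tfrac{1}{2}N\left(1+\ln(2\pi)-\ln(N)\right) - \tfrac{1}{2}\sum_{i=1}^{N} Z_i\alpha - \tfrac{1}{2}N\ln(\varepsilon^{*\prime}\varepsilon^*)$$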
where $\varepsilon_i^* = \varepsilon_i/\exp(Z_i\alpha)^{1/2}$ for each observation $i=1,2,\ldots,N$.
Equivalently,
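$$ll^*(\beta,\alpha \mid Y,X) = -\tfrac{1}{2}N\left(1+\ln(2\pi)-\ln(N)\right) - \tfrac{1}{2}N\ln(\varepsilon^{**\prime}\varepsilon^{**})$$
where $\varepsilon_i^{**} = \varepsilon_i\sqrt{h/h_i}$, $h = (h_1 h_2 \cdots h_N)^{1/N}$, and $h_i = \exp(Z_i\alpha)$ for each observation $i=1,2,\ldots,N$.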
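To see how the exponential form behaves in estimation, the sketch below profiles the concentrated log-likelihood over a grid of $\alpha$ values. It is only an illustration under simplifying assumptions: $Z_i$ is a single scalar variable, the residuals are held fixed (in practice $\beta$ and $\alpha$ are estimated jointly), and a coarse grid search stands in for a proper numerical optimizer. The function name and the data are hypothetical.

```python
import math

def concentrated_loglik_exp(resid, z, alpha):
    """Concentrated log-likelihood with h_i = exp(z_i * alpha), scalar z_i."""
    n = len(resid)
    ss_star = sum(e * e / math.exp(zi * alpha) for e, zi in zip(resid, z))
    const = -0.5 * n * (1.0 + math.log(2.0 * math.pi) - math.log(n))
    return const - 0.5 * alpha * sum(z) - 0.5 * n * math.log(ss_star)

# Illustrative data: squared residuals equal exp(z_i), so alpha = 1 fits exactly.
z = [0.0, 1.0, 2.0, 3.0, 4.0]
resid = [(-1.0) ** i * math.exp(0.5 * zi) for i, zi in enumerate(z)]

grid = [0.25 * k for k in range(9)]  # alpha = 0.0, 0.25, ..., 2.0
best_alpha = max(grid, key=lambda a: concentrated_loglik_exp(resid, z, a))
print(best_alpha)
```

With these constructed residuals the grid search selects $\alpha = 1$, the value used to generate them. At $\alpha = 0$ every $h_i$ equals one and the criterion collapses to the homoscedastic least squares case.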
$$\sigma_i^2 = \sigma^2 Z_i^{\alpha}$$
If $\alpha = 0$, the model is homoscedastic; if $\alpha = 2$, it is case (ii).
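Reading the specification above as the power form $h_i = Z_i^{\alpha}$ with a scalar $Z_i > 0$ (an assumption made here, consistent with $\alpha = 0$ giving homoscedasticity), substituting $\ln(h_i) = \alpha\ln(Z_i)$ into the general concentrated log-likelihood gives the following sketch:

```latex
ll^*(\beta,\alpha \mid Y,X)
  = -\tfrac{1}{2}N\left(1+\ln(2\pi)-\ln(N)\right)
    - \tfrac{1}{2}\,\alpha\sum_{i=1}^{N}\ln(Z_i)
    - \tfrac{1}{2}N\ln(\varepsilon^{*\prime}\varepsilon^*),
\qquad
\varepsilon_i^* = \frac{\varepsilon_i}{Z_i^{\alpha/2}}
```

At $\alpha = 0$ the middle term vanishes and $\varepsilon^* = \varepsilon$, so the criterion reduces to ordinary (nonlinear) least squares.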
$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon$$
Find and compare the maximum likelihood estimates based on the following hypotheses of heteroscedasticity: