ll(β,α,σ²|Y,X) = -½N (ln(2π) + ln(σ²)) - ½ ln(|Ω|) - ½(1/σ²)(e'Ω⁻¹e) + Σi=1,2,...,N ln(|∂ei/∂Yi|)
The last term is the sum of log-Jacobians from ei to Yi over the entire sample. Since the variance term can be solved as σ² = e'Ω⁻¹e / N for a given Ω, the concentrated log-likelihood function is
ll*(β,α|Y,X) = -½N (1 + ln(2π) - ln(N)) - ½N ln(e'Ω⁻¹e) - ½ ln(|Ω|) + Σi=1,2,...,N ln(|∂ei/∂Yi|)
It is clear that maximum likelihood estimation is in general not equivalent to nonlinear least squares unless Ω = I, the identity matrix, and ∂ei/∂Yi = 1 for each i = 1,2,...,N. If Ω is known and the log-Jacobians vanish, it reduces to the GLS (Generalized Least Squares) problem of minimizing e'Ω⁻¹e.
Unfortunately, Ω = Ω(α) is not known and must be parameterized by a lower-dimensional vector α, which in turn is estimated together with the vector of model parameters β. Models with heteroscedastic and/or autocorrelated errors are special cases of the general regression model in which Ω(α) is defined more specifically.
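When Ω is known, the GLS estimator can be computed directly. A minimal sketch in Python for the linear case (the diagonal Ω and the data below are simulated purely for illustration, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100
X = np.column_stack([np.ones(N), rng.normal(size=N)])
beta_true = np.array([1.0, 2.0])

# A known diagonal Omega (heteroscedastic variances), chosen arbitrarily
omega_diag = np.exp(rng.uniform(0.0, 1.0, size=N))
Omega_inv = np.diag(1.0 / omega_diag)
Y = X @ beta_true + rng.normal(size=N) * np.sqrt(omega_diag)

# GLS minimizes e'Omega^{-1}e; for a linear model the solution is
# b = (X'Omega^{-1}X)^{-1} X'Omega^{-1}Y
b_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ Y)
print(b_gls)
```

The first-order condition X'Ω⁻¹(Y - Xb) = 0 holds exactly at the GLS solution, which is a convenient check on the computation.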
For simplicity, consider a regression model e = F(Y,X,β) = Y - f(X,β). Then for each data observation i, ∂ei/∂Yi = 1. We assume further that the heteroscedastic error ei ~ normal(0,σi²). The log-likelihood function is
ll(β,σi²|Yi,Xi) = -½ [ln(2π) + ln(σi²) + ei²/σi²]
Summing over a sample of N observations, the total log-likelihood function is written as
ll(β,σ1²,σ2²,...,σN²|Y,X) = -½N ln(2π) - ½ Σi=1,2,...,N ln(σi²) - ½ Σi=1,2,...,N (ei²/σi²)
Given the general form of heteroscedasticity, there are too many unknown parameters (one variance per observation). For practical purposes, some hypothesis of heteroscedasticity must be assumed:
σi² = σ² hi(α)
where σ² > 0 and hi(α) is indexed by i to indicate that it is a function of Zi. That is, hi(α) = h(α|Zi), where Z is a set of independent variables that may or may not coincide with X. Depending on the form of heteroscedasticity hi(α), denoted by hi for brevity, the log-likelihood function is written as
ll(β,α,σ²|Y,X) = -½N (ln(2π) + ln(σ²)) - ½ Σi=1,2,...,N ln(hi) - ½(1/σ²) Σi=1,2,...,N (ei²/hi)
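This log-likelihood is straightforward to evaluate for a given error vector and a chosen form of hi; a minimal numpy sketch (the function name hetero_loglik is ours, not from any particular library):

```python
import numpy as np

def hetero_loglik(e, h, s2):
    """ll = -N/2*(ln(2*pi) + ln(s2)) - 1/2*sum(ln h_i) - 1/(2*s2)*sum(e_i^2/h_i)."""
    N = e.size
    return (-0.5 * N * (np.log(2.0 * np.pi) + np.log(s2))
            - 0.5 * np.sum(np.log(h))
            - 0.5 * np.sum(e**2 / h) / s2)
```

With hi = 1 for every observation and σ² = 1 this collapses to the standard normal log-likelihood, which is a convenient sanity check.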
Let ei* = ei/√hi and substitute out the maximum likelihood estimator of σ² with e*'e*/N; then the concentrated log-likelihood function is
ll*(β,α|Y,X) = -½N (1 + ln(2π) - ln(N)) - ½ Σi=1,2,...,N ln(hi) - ½N ln(e*'e*)
The last two log-terms can be combined as:
ll*(β,α|Y,X) = -½N (1 + ln(2π) - ln(N)) - ½N ln(e**'e**)
where e** = e*√h̄ and h̄ = (h1h2...hN)^(1/N) is the geometric mean of the hi's. It becomes a weighted nonlinear least squares problem with the weighted errors defined by ei** = ei√(h̄/hi).
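That the two concentrated forms agree, i.e. -½Σ ln(hi) - ½N ln(e*'e*) = -½N ln(e**'e**), can be verified numerically; a small sketch with arbitrary positive weights:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50
e = rng.normal(size=N)
h = np.exp(rng.uniform(-1.0, 1.0, size=N))   # arbitrary positive h_i's

e_star = e / np.sqrt(h)
h_bar = np.prod(h) ** (1.0 / N)              # geometric mean (h1*h2*...*hN)^(1/N)
e_2star = e_star * np.sqrt(h_bar)            # e_i** = e_i*sqrt(h_bar/h_i)

lhs = -0.5 * np.sum(np.log(h)) - 0.5 * N * np.log(e_star @ e_star)
rhs = -0.5 * N * np.log(e_2star @ e_2star)
print(np.isclose(lhs, rhs))                  # prints True
```

The identity follows because ln(e**'e**) = ln h̄ + ln(e*'e*) and -½N ln h̄ = -½Σ ln(hi).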
Consider the following special cases of hi = h(α|Zi). One common specification is multiplicative heteroscedasticity:
hi = exp(Ziα)
The corresponding concentrated log-likelihood function for estimation is
ll*(β,α|Y,X) = -½N (1 + ln(2π) - ln(N)) - ½ Σi=1,2,...,N Ziα - ½N ln(e*'e*)
where ei* = ei/exp(½Ziα) for each observation i = 1,2,...,N.
Equivalently,
ll*(β,α|Y,X) = -½N (1 + ln(2π) - ln(N)) - ½N ln(e**'e**)
where ei** = ei√(h̄/hi), h̄ = (h1h2...hN)^(1/N), and hi = exp(Ziα) for each observation i = 1,2,...,N.
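Estimation under multiplicative heteroscedasticity then amounts to maximizing the concentrated log-likelihood over (β, α) jointly. A hedged sketch using scipy.optimize (the data are simulated for illustration; in practice ei and Zi come from the model at hand):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
N = 200
x = rng.uniform(1.0, 5.0, size=N)
X = np.column_stack([np.ones(N), x])
Z = x                                          # Z coincides with the regressor here
beta_true, alpha_true = np.array([1.0, 2.0]), 0.5
# sigma_i^2 = exp(Z_i*alpha) so the error standard deviation is exp(Z_i*alpha/2)
Y = X @ beta_true + rng.normal(size=N) * np.exp(0.5 * alpha_true * Z)

def neg_cll(theta):
    """Negative concentrated ll (constants dropped): 1/2*sum(Z_i*alpha) + N/2*ln(e*'e*)."""
    beta, alpha = theta[:2], theta[2]
    e = Y - X @ beta
    e_star_sq = np.sum(e**2 / np.exp(Z * alpha))
    return 0.5 * np.sum(Z * alpha) + 0.5 * N * np.log(e_star_sq)

x0 = np.append(np.linalg.lstsq(X, Y, rcond=None)[0], 0.1)   # start from OLS, alpha = 0.1
res = minimize(neg_cll, x0, method="Nelder-Mead")
beta_hat, alpha_hat = res.x[:2], res.x[2]
print(beta_hat, alpha_hat)
```

Starting the search from the OLS estimates of β is a common practical choice, since OLS remains consistent for β under heteroscedasticity.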
σi² = σ² Zi^α
If α = 0, the model is homoscedastic; if α = 2, it is case (ii).
Y = β0 + β1 X + β2 X² + e
Find and compare the maximum likelihood estimates based on the following hypotheses of heteroscedasticity (Program and Data):
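Since the program and data referenced above are not reproduced here, the comparison can at least be sketched with simulated data; the two hypothesis labels and the simulated series below are our own illustration, not the original data set:

```python
import numpy as np
from scipy.optimize import minimize

# Simulated data standing in for the (unavailable) data set referenced in the text.
rng = np.random.default_rng(3)
N = 100
x = rng.uniform(1.0, 10.0, size=N)
X = np.column_stack([np.ones(N), x, x**2])
Y = X @ np.array([5.0, 1.0, -0.05]) + rng.normal(size=N) * x   # true: sigma_i^2 = s2*x_i^2

def neg_cll(theta, h_of_alpha):
    """Negative concentrated ll (constants dropped) for a given weight function h_i(alpha)."""
    beta, alpha = theta[:3], theta[3]
    e = Y - X @ beta
    h = h_of_alpha(alpha)
    return 0.5 * np.sum(np.log(h)) + 0.5 * N * np.log(np.sum(e**2 / h))

hypotheses = {
    "h_i = exp(alpha*X_i)": lambda a: np.exp(a * x),
    "h_i = X_i^alpha":      lambda a: x**a,
}
x0 = np.array([1.0, 1.0, 0.0, 0.1])
results = {name: minimize(neg_cll, x0, args=(h,), method="Nelder-Mead")
           for name, h in hypotheses.items()}
for name, res in results.items():
    print(name, "beta:", res.x[:3], "alpha:", res.x[3])
```

Comparing the maximized log-likelihood values (not just the parameter estimates) across hypotheses is what allows the forms of heteroscedasticity to be ranked against one another.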