$F(z, b) = e$

F(z,b) defines the functional form of the model, where z = [y, x] is the data matrix including both the dependent (endogenous) variable y and the independent (exogenous) variables x, b is the vector of unknown parameters, and e is the model error. A typical nonlinear model in econometrics takes a form that is separable between y and x:

$F(z, b) = F(y, x, b) = y - f(x, b)$

Least squares estimation minimizes the sum of squared errors:

$S(b|y,x) = e'e = F(y,x,b)'\,F(y,x,b)$
Least squares estimates of the parameters are computed from the first-order condition for minimization (zero gradient):
$\partial S/\partial b = 2e'(\partial e/\partial b) = 0$
Finally, the following Hessian matrix must be checked for positive definiteness:
$\partial^2 S/\partial b\,\partial b' = 2\left[(\partial e/\partial b)'(\partial e/\partial b) + \sum_{i=1}^N e_i\,(\partial^2 e_i/\partial b\,\partial b')\right]$
In addition to the parameter estimates, the estimated variance-covariance matrix of the parameters is derived from the positive definite Hessian matrix as follows:
$\mathrm{Var}(b) = s^2\left[\tfrac{1}{2}E(\partial^2 S/\partial b\,\partial b')\right]^{-1} = s^2\left[(\partial e/\partial b)'(\partial e/\partial b)\right]^{-1}$
where the estimated model variance is $s^2 = e'e/N$, and N is the sample size used for estimation.
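As a concrete illustration, here is a minimal Python sketch of these formulas. The model $f(x,b) = b_0 + b_1 x^{b_2}$ and the simulated data are hypothetical, not from the text; the sketch minimizes S(b) numerically and recovers Var(b) from the Jacobian of the residuals.

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical separable model: e = y - f(x,b), with f(x,b) = b0 + b1 * x**b2.
def residuals(b, y, x):
    return y - (b[0] + b[1] * x ** b[2])

# Simulated data for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 5.0, size=100)
y = 1.0 + 2.0 * x ** 0.5 + rng.normal(0.0, 0.1, size=100)

# Minimize S(b|y,x) = e'e; the solver drives the gradient 2e'(de/db) to zero.
fit = least_squares(residuals, x0=np.array([0.5, 1.0, 1.0]), args=(y, x))

# Var(b) = s^2 [(de/db)'(de/db)]^{-1}, with s^2 = e'e/N, using the numerical
# Jacobian de/db evaluated at the solution.
e, J, N = fit.fun, fit.jac, len(y)
s2 = (e @ e) / N
var_b = s2 * np.linalg.inv(J.T @ J)
print("estimates:", fit.x)
print("std errors:", np.sqrt(np.diag(var_b)))
```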
If there are inequality parameter constraints (e.g., non-negativity) expressed in terms of a continuous transformation b = f(a), where a is an unconstrained parameter vector, then from the estimator of a and Var(a) we have

$b = f(a)$

$\mathrm{Var}(b) = (\partial f/\partial a)\left[\mathrm{Var}(a)\right](\partial f/\partial a)'$
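For instance, a non-negativity constraint can be imposed with the elementwise transformation b = exp(a); the sketch below applies the variance formula above. The choice of transformation and the placeholder estimates are illustrative assumptions, not from the text.

```python
import numpy as np

# Enforce non-negativity via b = f(a) = exp(a): estimate the unconstrained
# vector a, then map the estimates back. a_hat and var_a are placeholders
# standing in for the output of an unconstrained fit.
a_hat = np.array([0.2, -1.0])
var_a = np.array([[0.04, 0.00],
                  [0.00, 0.09]])

b_hat = np.exp(a_hat)           # b = f(a) >= 0 by construction
J = np.diag(np.exp(a_hat))      # df/da is diagonal for an elementwise exp

# Delta method: Var(b) = (df/da) Var(a) (df/da)'.
var_b = J @ var_a @ J.T
print(b_hat, np.sqrt(np.diag(var_b)))
```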
The model may also be estimated by maximum likelihood. Assuming normally distributed errors, the log-likelihood function for observation i is:
$ll(b,\sigma^2|y_i,x_i) = -\tfrac{1}{2}\left[\ln(2\pi) + \ln(\sigma^2) + e_i^2/\sigma^2\right] + \ln(J_i(b))$
where $e_i = F(y_i,x_i,b)$, and $J_i(b) = |\partial e_i/\partial y_i|$ is the Jacobian of the transformation from $e_i$ to $y_i$. The model is estimated by maximizing the log-likelihood summed over a sample of N observations:
$ll(b,\sigma^2|y,x) = -\tfrac{1}{2}N\left[\ln(2\pi) + \ln(\sigma^2)\right] - \tfrac{1}{2}(e'e/\sigma^2) + \sum_{i=1}^N \ln(J_i(b))$
The solution is obtained from the system of first-order conditions:

$\partial ll/\partial b = -(e'/\sigma^2)(\partial e/\partial b) + \sum_{i=1}^N \partial \ln(J_i(b))/\partial b = 0$

$\partial ll/\partial \sigma^2 = -N/(2\sigma^2) + e'e/(2\sigma^4) = 0$
Usually the maximum likelihood estimation is performed by substituting out the variance, using the estimate $\sigma^2 = e'e/N$ implied by the second condition. Then the following concentrated log-likelihood function is maximized to find the parameter estimates b:
$ll^*(b|y,x) = -\tfrac{1}{2}N\left[1 + \ln(2\pi) - \ln(N)\right] - \tfrac{1}{2}N\ln(e'e) + \sum_{i=1}^N \ln(J_i(b))$
Let $e^* = e/\left[(J_1 \cdots J_N)^{1/N}\right]$. Then the last two terms of the above concentrated log-likelihood function can be combined, and the function is rewritten as

$ll^*(b|y,x) = -\tfrac{1}{2}N\left[1 + \ln(2\pi) - \ln(N)\right] - \tfrac{1}{2}N\ln(e^{*\prime}e^*)$
Therefore, maximizing the concentrated log-likelihood function $ll^*(b|y,x)$ is equivalent to minimizing the sum of squared weighted errors:

$S^*(b|y,x) = e^{*\prime}e^*$

where $e^* = we$, with the weight $w = 1/\left[(J_1 \cdots J_N)^{1/N}\right]$ (the inverse of the geometric mean of the Jacobians) applied to each observation of the error terms e.
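A sketch of this weighted least squares formulation follows, using a hypothetical Box-Cox style model in which the Jacobian depends on the parameters (the model and simulated data are illustrative assumptions, not from the text).

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical model with a parameter-dependent Jacobian:
#   e_i = (y_i**lam - 1)/lam - b0 - b1*x_i,  J_i = |de_i/dy_i| = y_i**(lam - 1).
def weighted_residuals(b, y, x):
    b0, b1, lam = b
    e = (y ** lam - 1.0) / lam - b0 - b1 * x
    J = y ** (lam - 1.0)                  # per-observation Jacobian of transformation
    w = 1.0 / np.exp(np.mean(np.log(J)))  # inverse geometric mean of the Jacobians
    return w * e                          # e* = w*e, so e*'e* is S*(b|y,x)

# Simulated data with true (b0, b1, lam) = (2, 1.5, 0.5), for illustration only.
rng = np.random.default_rng(1)
x = rng.uniform(1.0, 3.0, size=80)
y = (1.0 + 0.5 * (2.0 + 1.5 * x + rng.normal(0.0, 0.1, size=80))) ** 2.0

# Minimizing e*'e* maximizes the concentrated log-likelihood ll*(b|y,x).
fit = least_squares(weighted_residuals, x0=np.array([1.0, 1.0, 1.0]), args=(y, x))
print(fit.x)
```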
Solving from the first-order (zero-gradient) condition:

$\partial ll^*/\partial b = -\tfrac{1}{2}N\,\partial \ln(S^*)/\partial b = -\tfrac{1}{2}(N/S^*)(\partial S^*/\partial b) = -(N/S^*)\left[e^{*\prime}(\partial e^*/\partial b)\right] = 0$
the solution must be checked for negative definiteness of the Hessian matrix (the second-order condition):

$\partial^2 ll^*/\partial b\,\partial b' = -\tfrac{1}{2}N\,\partial^2 \ln(S^*)/\partial b\,\partial b' = \tfrac{1}{2}(N/S^*)\left[(1/S^*)(\partial S^*/\partial b)'(\partial S^*/\partial b) - (\partial^2 S^*/\partial b\,\partial b')\right]$

Since $\partial S^*/\partial b = 0$ from the first-order condition at the maximum likelihood solution, the Hessian matrix, which must be negative definite, is simply

$\partial^2 ll^*/\partial b\,\partial b' = -(N/S^*)\left[\tfrac{1}{2}(\partial^2 S^*/\partial b\,\partial b')\right] = -(N/S^*)\left[(\partial e^*/\partial b)'(\partial e^*/\partial b) + \sum_{i=1}^N e_i^*\,(\partial^2 e_i^*/\partial b\,\partial b')\right]$
Thus the estimated variance-covariance matrix of the maximum likelihood estimator is:

$\mathrm{Var}(b) = \left[-E(\partial^2 ll^*/\partial b\,\partial b')\right]^{-1} = s^{*2}\left[\tfrac{1}{2}E(\partial^2 S^*/\partial b\,\partial b')\right]^{-1} = s^{*2}\left[(\partial e^*/\partial b)'(\partial e^*/\partial b)\right]^{-1}$

where $s^{*2} = S^*/N$. It is easy to see that the solution derived from the maximum likelihood procedure is identical to that derived from the model formulated as a weighted least squares problem, in which the weight applied to each observation is the inverse of the geometric mean of the Jacobians, $1/\left[(J_1 \cdots J_N)^{1/N}\right]$.
Further, as in the case of nonlinear least squares, if there are inequality parameter constraints (e.g., non-negativity) expressed in terms of a continuous transformation b = f(a), where a is an unconstrained parameter vector, then from the estimator of a and Var(a) we have

$b = f(a)$

$\mathrm{Var}(b) = (\partial f/\partial a)\left[\mathrm{Var}(a)\right](\partial f/\partial a)'$
When the Jacobian terms $J_i(b)$ all equal one, the concentrated log-likelihood function reduces to

$ll^*(b|y,x) = -\tfrac{1}{2}N\left[1 + \ln(2\pi) - \ln(N)\right] - \tfrac{1}{2}N\ln(e'e)$

This is exactly the special case of the classical nonlinear model in which $e = F(y,x,b) = y - f(x,b)$. For this special case, maximizing the concentrated log-likelihood function $ll^*(b|y,x)$ is the same as minimizing the sum of squared errors $S(b|y,x)$.
Consider, for example, the CES (constant elasticity of substitution) production function:

$\ln(Q) = b_1 + b_4 \ln\left(b_2 L^{b_3} + (1-b_2) K^{b_3}\right) + e$

Based on Zellner and Revankar [1970], the CES production function may be generalized to allow for variable returns to scale as follows:

$\ln(Q) + \theta Q = b_1 + b_4 \ln\left(b_2 L^{b_3} + (1-b_2) K^{b_3}\right) + e$
Modify and estimate the Generalized CES production function (Program).
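One possible setup for this exercise is sketched below in Python, estimating the generalized CES model as a weighted least squares problem with Jacobian $J_i = |1/Q_i + \theta|$. The simulated data, starting values, and bounds are all assumptions standing in for the production data set used by the referenced program.

```python
import numpy as np
from scipy.optimize import least_squares

# Weighted least squares form of the generalized CES model:
#   ln(Q) + theta*Q = b1 + b4*ln(b2*L**b3 + (1-b2)*K**b3) + e
# with per-observation Jacobian J_i = |de_i/dQ_i| = |1/Q_i + theta|.
def weighted_residuals(p, Q, L, K):
    b1, b2, b3, b4, theta = p
    e = np.log(Q) + theta * Q - b1 - b4 * np.log(b2 * L**b3 + (1.0 - b2) * K**b3)
    J = np.abs(1.0 / Q + theta)
    w = 1.0 / np.exp(np.mean(np.log(J)))   # inverse geometric mean of Jacobians
    return w * e

# Simulated data with true (b1, b2, b3, b4, theta) = (1, 0.4, -1, -1, 0).
rng = np.random.default_rng(2)
L = rng.uniform(1.0, 10.0, size=30)
K = rng.uniform(1.0, 10.0, size=30)
Q = np.exp(1.0 - np.log(0.4 / L + 0.6 / K) + rng.normal(0.0, 0.05, size=30))

# Bounds keep b2 inside (0,1) and theta non-negative so the Jacobian stays valid.
lb = [-np.inf, 1e-4, -np.inf, -np.inf, 0.0]
ub = [np.inf, 1.0 - 1e-4, np.inf, np.inf, np.inf]
start = np.array([0.5, 0.5, -0.5, -0.5, 0.01])
fit = least_squares(weighted_residuals, start, args=(Q, L, K), bounds=(lb, ub))
print(fit.x)
```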
Statistical Inferences in Nonlinear Models

The fundamental assumption of statistical inference is the normal probability distribution of the model error:

$e = F(y,x,b) \sim \mathrm{normal}(0, \sigma^2 I)$

Thus the estimated least squares or maximum likelihood parameter vector b* is normally distributed:

$b^* \sim \mathrm{normal}(b, \mathrm{Var}(b^*))$

where the estimated variance-covariance matrix is
$\mathrm{Var}(b^*) = \left[-E(\partial^2 ll(b^*)/\partial b\,\partial b')\right]^{-1} = s^{*2}\left[\tfrac{1}{2}E(\partial^2 S(b^*)/\partial b\,\partial b')\right]^{-1}$

and the estimated asymptotic variance of the model is $s^{*2} = S(b^*)/N$.
The confidence region of the true parameter vector b is derived from the estimated b* based on the following familiar asymptotic F statistic:

$F = \frac{\left[S(b) - S(b^*)\right]/J}{s^{*2}}$
where J is the degrees of freedom associated with the hypotheses being tested. Approximating the sum-of-squares function S(b) at b* up to the second order, with $\partial S(b^*)/\partial b = 0$ at the minimum, gives

$S(b) - S(b^*) = \tfrac{1}{2}(b-b^*)'\left[\partial^2 S(b^*)/\partial b\,\partial b'\right](b-b^*)$
Therefore, the test statistic for testing b = b*:

$JF = (b-b^*)'\left[\mathrm{Var}(b^*)\right]^{-1}(b-b^*)$
follows a Chi-square distribution with J degrees of freedom.
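A short sketch of this chi-square test follows; the estimates b_hat and var_b are placeholders standing in for the output of a nonlinear fit such as the one above.

```python
import numpy as np
from scipy.stats import chi2

# Chi-square test of b = b0 using unconstrained estimates and their covariance.
b_hat = np.array([1.02, 1.98, 0.51])        # placeholder estimates b*
var_b = np.diag([0.01, 0.04, 0.0025])       # placeholder Var(b*)
b0 = np.array([1.0, 2.0, 0.5])              # hypothesized parameter values

d = b0 - b_hat
JF = d @ np.linalg.inv(var_b) @ d           # JF = (b-b*)' Var(b*)^{-1} (b-b*)
p_value = chi2.sf(JF, df=len(b0))           # J degrees of freedom
print(JF, p_value)
```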
The Wald test is based on J (possibly nonlinear) constraint equations of the form

$c(b) = 0$

If the constraints were true, then without estimating the constrained model, the unconstrained parameter estimator b* would be expected to satisfy the constraint equation closely; that is, $c(b^*) \approx 0$. The test statistic

$W = c(b^*)'\left[\mathrm{Var}(c(b^*))\right]^{-1} c(b^*)$
has a Chi-square distribution with J degrees of freedom. With a first-order linear approximation of the constraint function c(b) at b*,

$W = c(b^*)'\left\{(\partial c(b^*)/\partial b)\left[\mathrm{Var}(b^*)\right](\partial c(b^*)/\partial b)'\right\}^{-1} c(b^*)$
Note that this test statistic does not require the computation of the constrained parameter estimator.
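A sketch of the Wald test for a single nonlinear restriction follows, using the constraint $b_4 = 1/b_3$ from the exercise at the end of this section and a forward-difference approximation of $\partial c/\partial b$; the estimates and covariance are placeholders.

```python
import numpy as np
from scipy.stats import chi2

# Wald test of c(b) = 0 computed from the unconstrained estimates only.
# Restriction b4 = 1/b3, written as c(b) = b[3] - 1/b[2] (J = 1 restriction).
def c(b):
    return np.array([b[3] - 1.0 / b[2]])

def jacobian_c(b, h=1e-6):
    # Forward-difference approximation of dc/db (one row per restriction).
    base = c(b)
    G = np.zeros((base.size, b.size))
    for k in range(b.size):
        bk = b.copy()
        bk[k] += h
        G[:, k] = (c(bk) - base) / h
    return G

b_hat = np.array([1.0, 0.4, -1.1, -0.8])   # placeholder unconstrained estimates
var_b = 0.01 * np.eye(4)                   # placeholder Var(b*)

G = jacobian_c(b_hat)
W = float(c(b_hat) @ np.linalg.inv(G @ var_b @ G.T) @ c(b_hat))
print("W =", W, "p-value =", chi2.sf(W, df=c(b_hat).size))
```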
The Lagrangian multiplier (LM) test is instead based on the constrained estimator, denoted $\tilde{b}^*$, obtained by maximizing the log-likelihood subject to c(b) = 0. The test statistic is written as:

$LM = (\partial ll(\tilde{b}^*)/\partial b)\left[\mathrm{Var}(\tilde{b}^*)\right](\partial ll(\tilde{b}^*)/\partial b)'$

which follows a Chi-square distribution with J degrees of freedom. The estimated variance-covariance matrix of the constrained estimator $\tilde{b}^*$ is computed as follows:

$\mathrm{Var}(\tilde{b}^*) = H^{-1}\left[I - G'(G H^{-1} G')^{-1} G H^{-1}\right]$

where $H = -\partial^2 ll(\tilde{b}^*)/\partial b\,\partial b'$ and $G = \partial c(\tilde{b}^*)/\partial b$.
The LM test statistic is easily approximated with the following formula:

$LM = \left\{\left[e(\tilde{b}^*)'(\partial e(\tilde{b}^*)/\partial b)\right]\left[(\partial e(\tilde{b}^*)/\partial b)'(\partial e(\tilde{b}^*)/\partial b)\right]^{-1}\left[e(\tilde{b}^*)'(\partial e(\tilde{b}^*)/\partial b)\right]'\right\} \Big/ \left(e(\tilde{b}^*)'e(\tilde{b}^*)/N\right)$

Note that the maximum likelihood estimates of the errors $e(\tilde{b}^*)$ may be properly weighted, and this test statistic is computed from the constrained parameter estimates alone.
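A sketch of the approximated LM statistic follows; the residual vector and Jacobian here are random placeholders standing in for the (possibly weighted) quantities evaluated at the constrained estimates $\tilde{b}^*$.

```python
import numpy as np
from scipy.stats import chi2

# LM = [e'(de/db)] [(de/db)'(de/db)]^{-1} [e'(de/db)]' / (e'e/N),
# evaluated at the constrained estimates.
def lm_statistic(e, de_db):
    g = e @ de_db                                         # e'(de/db), a 1 x K row
    return g @ np.linalg.inv(de_db.T @ de_db) @ g / (e @ e / len(e))

# Placeholder residuals and Jacobian at the constrained estimates.
rng = np.random.default_rng(3)
e = rng.normal(0.0, 0.1, size=30)
de_db = rng.normal(size=(30, 4))

LM = lm_statistic(e, de_db)
print(LM, chi2.sf(LM, df=1))    # J = 1 restriction in this illustration
```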
Finally, the likelihood ratio (LR) test compares the constrained and unconstrained maximum likelihood solutions. The test statistic

$LR = -2\left(ll(\tilde{b}^*) - ll(b^*)\right)$

follows a Chi-square distribution with J degrees of freedom, where J is the number of constraints in the equation c(b) = 0. In terms of sums of squares, it is

$LR = N \ln\left(S(\tilde{b}^*)/S(b^*)\right)$
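A sketch of the LR test in its sum-of-squares form follows; the two fitted sums of squares are placeholder values standing in for the unconstrained and constrained estimation results.

```python
import numpy as np
from scipy.stats import chi2

# LR = N ln(S(b~*)/S(b*)) from the constrained (S_c) and unconstrained (S_u)
# sums of squared (possibly weighted) errors; placeholder values below.
N, J = 30, 1
S_u, S_c = 0.52, 0.61
LR = N * np.log(S_c / S_u)
print(LR, chi2.sf(LR, df=J))
```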
For the CES production function:

$\ln(Q) = b_1 + b_4 \ln\left(b_2 L^{b_3} + (1-b_2) K^{b_3}\right) + e$

use the Wald test, Lagrangian multiplier test, and likelihood ratio test to verify the nonlinear equality constraint $b_4 = 1/b_3$ (Program).