Model misspecification due to violation of the classical assumption of homoscedasticity (Assumption 4) is considered here. First, we need to review the implications of the normality assumption in small samples, or of the asymptotic normality property in large samples.
The least squares residual $e_i = Y_i - X_i b \overset{a}{\sim} \text{Normal}(0, s^2[1 - X_i(X'X)^{-1}X_i'])$, $i = 1, 2, \ldots, N$, can be tested using the Bera-Jarque test statistic (for asymptotic normality) as follows. Compute:
$$\text{Variance} = s_*^2 = \frac{1}{N}\sum_{i=1}^{N} e_i^2$$
$$\text{Skewness} = \frac{\frac{1}{N}\sum_{i=1}^{N} e_i^3}{(s_*^2)^{3/2}}$$
$$\text{Kurtosis} = \frac{\frac{1}{N}\sum_{i=1}^{N} e_i^4}{(s_*^2)^{2}}$$
The Bera-Jarque test statistic for asymptotic normality is defined as
$$BJ = N\left[\frac{\text{Skewness}^2}{6} + \frac{(\text{Kurtosis} - 3)^2}{24}\right] \sim \chi^2(2).$$
For example, at the 0.05 level of significance, the critical value from $\chi^2(2)$ is $\chi^2_{0.95} = 5.99$. If BJ > 5.99, then the null hypothesis of asymptotic normality is rejected. On the other hand, if BJ ≤ 5.99, then normality cannot be rejected.
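The computation above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `bera_jarque` and the simulated data are hypothetical, and the moments are taken around zero exactly as in the formulas above (least squares residuals have zero mean when the model includes a constant term).

```python
import numpy as np

def bera_jarque(e):
    """Return (variance, skewness, kurtosis, BJ) for a residual vector e."""
    e = np.asarray(e, dtype=float)
    n = e.size
    var = np.sum(e**2) / n                     # second moment of the residuals
    skew = (np.sum(e**3) / n) / var**1.5       # third moment / variance^(3/2)
    kurt = (np.sum(e**4) / n) / var**2         # fourth moment / variance^2
    bj = n * (skew**2 / 6.0 + (kurt - 3.0)**2 / 24.0)
    return var, skew, kurt, bj

# Hypothetical data: a simple regression with normal errors.
rng = np.random.default_rng(0)
N = 500
X = np.column_stack([np.ones(N), rng.normal(size=N)])
Y = X @ np.array([1.0, 2.0]) + rng.normal(size=N)

b = np.linalg.solve(X.T @ X, X.T @ Y)          # least squares estimate
e = Y - X @ b                                  # least squares residuals
var, skew, kurt, bj = bera_jarque(e)
print(f"BJ = {bj:.3f}; reject normality at the 5% level: {bj > 5.99}")
```

The decision rule compares BJ with 5.99, the 5% critical value of the chi-square distribution with 2 degrees of freedom quoted above.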
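$$\text{Var}(\varepsilon|X) = E(\varepsilon\varepsilon'|X) = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_N^2 \end{bmatrix} = \sigma^2 \begin{bmatrix} \omega_1 & 0 & \cdots & 0 \\ 0 & \omega_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \omega_N \end{bmatrix} = \sigma^2\Omega$$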
For convenience, we use the normalization $\sum_{i=1}^{N} \omega_i = N$. Therefore $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} \sigma_i^2$.
If heteroscedasticity is ignored in ordinary least squares estimation, the parameter estimators are inefficient, although they remain unbiased, consistent, and asymptotically normally distributed.
From the estimated model $Y = Xb + e$, we have:
$$b = (X'X)^{-1}X'Y = \beta + (X'X)^{-1}X'\varepsilon$$
$$e = Y - Xb = [I - X(X'X)^{-1}X']\varepsilon$$
$$s^2 = \frac{e'e}{N-K} = \frac{\varepsilon'[I - X(X'X)^{-1}X']\varepsilon}{N-K}$$
$E(b|X) = \beta$, by Assumption 3.
But, in general, $E(s^2) \neq \sigma^2$.
However, it can be shown that if $b$ is consistent then $s^2$ is a consistent estimator of $\sigma^2$:
If $\text{plim}(b) = \beta$, then $\text{plim}(s^2) = \sigma^2$.
$$\begin{aligned}
\text{Var}(b|X) &= E[(b-\beta)(b-\beta)'|X] \\
&= \sigma^2 (X'X)^{-1}X'\Omega X(X'X)^{-1}, \quad \text{by assuming heteroscedasticity} \\
&= (X'X)^{-1}\left\{\sum_{i=1}^{N} \sigma_i^2 X_i'X_i\right\}(X'X)^{-1} \\
&= \frac{1}{N}\left(\frac{X'X}{N}\right)^{-1}\left\{\frac{1}{N}\sum_{i=1}^{N} \sigma_i^2 X_i'X_i\right\}\left(\frac{X'X}{N}\right)^{-1} \\
&= \frac{\sigma^2}{N}\left(\frac{X'X}{N}\right)^{-1}\left\{\frac{1}{N}\sum_{i=1}^{N} \omega_i X_i'X_i\right\}\left(\frac{X'X}{N}\right)^{-1}
\end{aligned}$$
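Therefore, $b \overset{a}{\sim} \text{Normal}(\beta, (\sigma^2/N)\,Q^{-1}Q^*Q^{-1})$,
where $Q = \text{plim}(X'X/N)$, and $Q^* = \text{plim}(X'\Omega X/N) = \text{plim}\left(\frac{1}{N}\sum_{i=1}^{N} \omega_i X_i'X_i\right)$.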
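Although the individual variances $\sigma_i^2$ are unknown, the middle term can be estimated consistently from the squared least squares residuals:
$$\text{plim}\,\frac{1}{N}\sum_{i=1}^{N} \sigma_i^2 X_i'X_i = \text{plim}\,\frac{1}{N}\sum_{i=1}^{N} e_i^2 X_i'X_i$$
Therefore the estimated heteroscedasticity-consistent (robust) variance-covariance matrix of $b$ is obtained by
$$\widehat{\text{Var}}(b|X) = (X'X)^{-1}\left\{\sum_{i=1}^{N} e_i^2 X_i'X_i\right\}(X'X)^{-1}$$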
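As a rough illustration of the robust variance-covariance formula, the following NumPy sketch computes both the conventional and the heteroscedasticity-consistent matrices. The function name `ols_robust` and the simulated data (error standard deviation proportional to the regressor) are assumptions made for the example only.

```python
import numpy as np

def ols_robust(X, Y):
    """OLS estimate with conventional and heteroscedasticity-consistent covariances."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    N, K = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ Y)
    e = Y - X @ b
    cov_ols = (e @ e / (N - K)) * XtX_inv      # s^2 (X'X)^{-1}
    middle = (X * (e**2)[:, None]).T @ X       # sum_i e_i^2 X_i' X_i
    cov_robust = XtX_inv @ middle @ XtX_inv    # sandwich form
    return b, e, cov_ols, cov_robust

# Hypothetical heteroscedastic data: the error standard deviation equals x.
rng = np.random.default_rng(1)
N = 400
x = rng.uniform(1.0, 10.0, size=N)
X = np.column_stack([np.ones(N), x])
Y = X @ np.array([1.0, 0.5]) + x * rng.normal(size=N)

b, e, cov_ols, cov_robust = ols_robust(X, Y)
print("conventional s.e.:", np.sqrt(np.diag(cov_ols)))
print("robust s.e.:      ", np.sqrt(np.diag(cov_robust)))
```

The point estimates are the same in both cases; only the standard errors used for inference differ.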
Suppose the sample can be divided into two or three groups according to the size of the residual variances (which may be related to some explanatory variables). Let $N_1$ be the number of observations in the first group, with larger variances, and $N_2$ the number of observations in the second group, with smaller variances, so that $N_1 + N_2 \leq N$ (there may be a third, middle group, which is dropped). Let $RSS_1$ and $RSS_2$ be the corresponding sums of squared residuals from the model $Y = Xb + e$ with $K$ parameters, estimated separately on the $N_1$ and $N_2$ observations. Then, under the null hypothesis of homoscedasticity,
$$\frac{RSS_1/(N_1-K)}{RSS_2/(N_2-K)} \sim F(N_1-K,\, N_2-K).$$
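The group-comparison procedure above (commonly known as the Goldfeld-Quandt test) might be sketched as follows. The helper names, the `drop_frac` parameter for the size of the omitted middle group, and the simulated data are all assumptions for illustration; the sketch also assumes that the error variance increases with the sorting variable.

```python
import numpy as np

def rss(X, Y):
    """Sum of squared residuals from a least squares fit of Y on X."""
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    r = Y - X @ b
    return r @ r

def goldfeld_quandt(X, Y, sort_by, drop_frac=0.2):
    """Return the F statistic and its degrees of freedom (N1-K, N2-K)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    order = np.argsort(np.asarray(sort_by))[::-1]   # largest values first
    Xs, Ys = X[order], Y[order]
    N, K = X.shape
    n_mid = int(drop_frac * N)                      # middle group, dropped
    n1 = (N - n_mid) // 2                           # group with larger variances
    n2 = N - n_mid - n1                             # group with smaller variances
    rss1 = rss(Xs[:n1], Ys[:n1])
    rss2 = rss(Xs[N - n2:], Ys[N - n2:])
    F = (rss1 / (n1 - K)) / (rss2 / (n2 - K))
    return F, (n1 - K, n2 - K)

# Hypothetical data: the error standard deviation is proportional to x.
rng = np.random.default_rng(2)
N = 200
x = rng.uniform(1.0, 10.0, size=N)
X = np.column_stack([np.ones(N), x])
Y = X @ np.array([1.0, 0.5]) + x * rng.normal(size=N)

F, df = goldfeld_quandt(X, Y, sort_by=x)
print(f"F = {F:.3f} with degrees of freedom {df}")
```

A value of F well above the critical value of the F distribution with the reported degrees of freedom suggests that the first group has the larger error variance.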
Suppose the specification of heteroscedasticity depends on a set of exogenous variables Z, which must include a constant term and may include some or all of the explanatory variables X.
Extending the Breusch-Pagan test for heteroscedasticity, the auxiliary regression equation in step 2 of the Breusch-Pagan test procedure is modified as: $e^2 = W\delta + \upsilon$, where $W = [1 \;\; Z \;\; Z{*}Z]$. That is, if Z = X, then in addition to the same set of explanatory variables used in the original regression, their quadratic terms (squares and cross products) are included in the auxiliary regression equation.
The White LM test statistic is $NR^2$ from the estimated auxiliary regression. It follows a $\chi^2$ distribution with degrees of freedom equal to the number of variables in W, excluding the constant term.
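A minimal sketch of the White test statistic follows, assuming Z consists of the non-constant explanatory variables. The function name `white_lm_test` and the simulated data are hypothetical. The sketch does not remove redundant columns of W (for example, the square of a dummy variable duplicates the variable itself), which would need to be dropped in practice.

```python
import numpy as np
from itertools import combinations_with_replacement

def white_lm_test(Z, e):
    """Return (N*R^2, degrees of freedom) from regressing e^2 on W = [1 Z Z*Z]."""
    Z = np.asarray(Z, dtype=float)
    e = np.asarray(e, dtype=float)
    N, p = Z.shape
    # Squares and cross products of the columns of Z.
    quad = [Z[:, i] * Z[:, j]
            for i, j in combinations_with_replacement(range(p), 2)]
    W = np.column_stack([np.ones(N), Z] + quad)
    y = e**2
    d, *_ = np.linalg.lstsq(W, y, rcond=None)
    u = y - W @ d
    r2 = 1.0 - (u @ u) / np.sum((y - y.mean())**2)
    return N * r2, W.shape[1] - 1                   # exclude the constant term

# Hypothetical data: the error variance depends on the first regressor.
rng = np.random.default_rng(3)
N = 500
Z = rng.normal(size=(N, 2))
X = np.column_stack([np.ones(N), Z])
Y = X @ np.array([1.0, 0.5, -0.5]) + Z[:, 0] * rng.normal(size=N)

b, *_ = np.linalg.lstsq(X, Y, rcond=None)
e = Y - X @ b
lm, df = white_lm_test(Z, e)
print(f"White LM = {lm:.2f} with {df} degrees of freedom")
```

With two variables in Z, the auxiliary regression has five non-constant terms, so the statistic is compared with a chi-square critical value with 5 degrees of freedom (11.07 at the 5% level).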
Suppose the symmetric positive definite matrix $\Omega$ is known.
There exists a matrix $P$ (the "square root" matrix) such that $\Omega^{-1} = P'P$, or
$\Omega = P^{-1}(P^{-1})'$.
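Let $Y^* = PY$, $X^* = PX$, and $\varepsilon^* = P\varepsilon$; then the transformed linear regression model is:
$$Y^* = X^*\beta + \varepsilon^*,$$
and
$$E(\varepsilon^*|X^*) = 0$$
$$\text{Var}(\varepsilon^*|X^*) = P\,\text{Var}(\varepsilon|X)\,P' = \sigma^2 P\Omega P' = \sigma^2 I$$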
Since the classical assumptions for the transformed linear regression model are satisfied, least squares estimation is applied to minimize sum-of-squared transformed errors ε*'ε*:
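$$\begin{aligned}
b^* &= (X^{*\prime}X^*)^{-1}X^{*\prime}Y^* \\
&= \beta + (X^{*\prime}X^*)^{-1}X^{*\prime}\varepsilon^* \\
&= \beta + (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}\varepsilon
\end{aligned}$$
$$E(b^*|X) = \beta$$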
$$\begin{aligned}
\text{Var}(b^*|X) &= E[(b^*-\beta)(b^*-\beta)'|X] \\
&= \sigma^2 (X^{*\prime}X^*)^{-1} \\
&= \sigma^2 (X'\Omega^{-1}X)^{-1}
\end{aligned}$$
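Therefore, $b^* \overset{a}{\sim} \text{Normal}(\beta, \sigma^2(X'\Omega^{-1}X)^{-1})$, where $\sigma^2$ is estimated by $s^{*2} = e^{*\prime}e^*/(N-K)$ and $e^* = Y^* - X^*b^*$ is the residual vector of the transformed model.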
Statistical inference must be based on the generalized least squares estimator $b^*$, provided that the covariance structure $\Omega$ is known. If $\Omega$ is not known, then it must be estimated. If $\Omega$ can be estimated consistently, then in large samples the generalized least squares estimator $b^*$ is consistent and asymptotically efficient.
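One way to carry out the transformation numerically, assuming Ω is known, is through its Cholesky factor: if Ω = LL', then P = L⁻¹ satisfies Ω⁻¹ = P'P. The sketch below is illustrative only; the function name `gls`, the choice of the Cholesky factor as the "square root", and the simulated data are assumptions, not part of the derivation above.

```python
import numpy as np

def gls(X, Y, Omega):
    """Generalized least squares through the transformed model Y* = X* beta + eps*."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    N, K = X.shape
    L = np.linalg.cholesky(Omega)          # Omega = L L', so P = L^{-1}
    X_star = np.linalg.solve(L, X)         # X* = P X
    Y_star = np.linalg.solve(L, Y)         # Y* = P Y
    XsXs_inv = np.linalg.inv(X_star.T @ X_star)
    b_star = XsXs_inv @ (X_star.T @ Y_star)
    e_star = Y_star - X_star @ b_star
    s2_star = e_star @ e_star / (N - K)    # estimate of sigma^2
    return b_star, s2_star * XsXs_inv

# Hypothetical data with a known diagonal Omega, normalized so the omegas sum to N.
rng = np.random.default_rng(4)
N = 300
x = rng.uniform(1.0, 10.0, size=N)
omega = x**2 / np.mean(x**2)
Omega = np.diag(omega)
X = np.column_stack([np.ones(N), x])
Y = X @ np.array([1.0, 0.5]) + np.sqrt(omega) * rng.normal(size=N)

b_star, cov_b_star = gls(X, Y, Omega)
print("b* =", b_star)
print("s.e. =", np.sqrt(np.diag(cov_b_star)))
```

The same estimate can be obtained from the closed form $(X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y$; working with the transformed data avoids forming $\Omega^{-1}$ explicitly.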
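$$\text{Var}(\varepsilon|X) = \sigma^2\Omega = \sigma^2 \begin{bmatrix} \omega_1 & 0 & \cdots & 0 \\ 0 & \omega_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \omega_N \end{bmatrix}$$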
Let $w_i = 1/\sqrt{\omega_i}$, $X_i^* = w_iX_i$, and $Y_i^* = w_iY_i$, for $i = 1, 2, \ldots, N$. Then least squares estimation with the weighted data matrices is a special case of generalized least squares estimation, as follows:
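$$\begin{aligned}
b^* &= (X^{*\prime}X^*)^{-1}X^{*\prime}Y^* = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y \\
&= \left[\sum_{i=1}^{N} X_i^{*\prime}X_i^*\right]^{-1}\left[\sum_{i=1}^{N} X_i^{*\prime}Y_i^*\right] \\
&= \left[\sum_{i=1}^{N} (w_iX_i)'(w_iX_i)\right]^{-1}\left[\sum_{i=1}^{N} (w_iX_i)'(w_iY_i)\right]
\end{aligned}$$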
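$$\begin{aligned}
\widehat{\text{Var}}(b^*|X) &= s^{*2}(X^{*\prime}X^*)^{-1} = s^{*2}(X'\Omega^{-1}X)^{-1} \\
&= s^{*2}\left[\sum_{i=1}^{N} X_i^{*\prime}X_i^*\right]^{-1} \\
&= s^{*2}\left[\sum_{i=1}^{N} (w_iX_i)'(w_iX_i)\right]^{-1}
\end{aligned}$$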
We note that the interpretation of the estimated model with weighted least squares, Y* = X*b*, is the same as for Y = Xb*.
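The weighted least squares computation can be sketched as ordinary least squares on the weighted data. The function name `wls` and the simulated data are assumptions for illustration, and the ωᵢ are treated as known.

```python
import numpy as np

def wls(X, Y, omega):
    """Weighted least squares with weights w_i = 1/sqrt(omega_i)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    N, K = X.shape
    w = 1.0 / np.sqrt(np.asarray(omega, dtype=float))
    X_star = X * w[:, None]                # row i of X scaled by w_i
    Y_star = Y * w
    XsXs_inv = np.linalg.inv(X_star.T @ X_star)
    b_star = XsXs_inv @ (X_star.T @ Y_star)
    e_star = Y_star - X_star @ b_star
    s2_star = e_star @ e_star / (N - K)
    return b_star, s2_star * XsXs_inv

# Hypothetical data: Var(eps_i) proportional to x_i^2, so omega_i is proportional to x_i^2.
rng = np.random.default_rng(5)
N = 300
x = rng.uniform(1.0, 10.0, size=N)
omega = x**2 / np.mean(x**2)               # normalized so the omegas sum to N
X = np.column_stack([np.ones(N), x])
Y = X @ np.array([1.0, 0.5]) + np.sqrt(omega) * rng.normal(size=N)

b_star, cov_b_star = wls(X, Y, omega)
print("b* =", b_star)
print("s.e. =", np.sqrt(np.diag(cov_b_star)))
```

In this example the weights are proportional to 1/x, which corresponds to the case where the error standard deviation is proportional to one of the explanatory variables.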
If the source of heteroscedasticity is found to be one of the exogenous variables, say $X_k$, then $1/\sqrt{X_k}$ or $1/X_k$ may be used to weight the data matrices X and Y, and weighted least squares estimation is carried out to correct for heteroscedasticity. In general, the heteroscedastic variance is a function of X (in part or in whole). Consider the following cases:
The last case of multiplicative heteroscedasticity may be expressed in log form as:
$$\ln(\sigma_i^2) = \ln(\sigma^2) + \alpha_1\ln(X_{i1}) + \alpha_2\ln(X_{i2}) + \cdots + \alpha_K\ln(X_{iK})$$
This log-variance equation can be estimated as:
$$\ln(e_i^2) = \alpha_0 + \alpha_1\ln(X_{i1}) + \alpha_2\ln(X_{i2}) + \cdots + \alpha_K\ln(X_{iK}) + \upsilon_i$$
The exponential transformation of the fitted values, $\exp[\widehat{\ln(e_i^2)}]$, is used to approximate the heteroscedastic variance $\sigma_i^2$. We may test the significance of each $\alpha_i$. If the joint hypothesis $\alpha_i = 0$ for all $i = 1, 2, \ldots, K$ is not rejected, then the null hypothesis of homoscedasticity cannot be rejected.
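The two-step procedure described above (estimate the log-variance equation from the least squares residuals, then reweight) might look like the following sketch. The function names and the simulated data are hypothetical, and the variables entering the variance equation are assumed to be strictly positive so that their logs exist.

```python
import numpy as np

def ols(X, Y):
    """Least squares coefficients of Y on X."""
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return b

def multiplicative_fgls(X, Y, X_pos):
    """Weighted least squares with variances fitted from the log-variance equation.

    X_pos holds the strictly positive variables whose logs enter the equation.
    """
    N = X.shape[0]
    e = Y - X @ ols(X, Y)                           # least squares residuals
    G = np.column_stack([np.ones(N), np.log(X_pos)])
    alpha = ols(G, np.log(e**2))                    # estimated log-variance equation
    var_hat = np.exp(G @ alpha)                     # approximates sigma_i^2 up to scale
    w = 1.0 / np.sqrt(var_hat)
    b_star = ols(X * w[:, None], Y * w)             # a common scale factor in var_hat
    return alpha, var_hat, b_star                   # does not change b_star

# Hypothetical data: sigma_i^2 = x_i^2, that is, alpha_1 = 2.
rng = np.random.default_rng(6)
N = 2000
x = rng.uniform(1.0, 10.0, size=N)
X = np.column_stack([np.ones(N), x])
Y = X @ np.array([1.0, 0.5]) + x * rng.normal(size=N)

alpha, var_hat, b_star = multiplicative_fgls(X, Y, x)
print("estimated alpha:", alpha)
print("weighted least squares estimate b*:", b_star)
```

The slope coefficients of the log-variance regression estimate the α's; testing them jointly against zero requires the standard errors of that auxiliary regression, which this sketch does not compute.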