Systems of Regression Equations
Consider a system of G regression equations, each with N observations, indexed by i:

Yi = Xiβi + εi   (i=1,2,...,G)

where Yi = [Yi1,Yi2,...,YiN]', Xi = [Xi1,Xi2,...,XiN]', and εi = [εi1,εi2,...,εiN]'.
The model satisfies the following assumptions:
E(εi) = 0 (N×1)
Cov(Xi,εi) = E(Xi'εi) = 0 (K×1)
Cov(εi,εj) = E(εiεj') = σijΩij
- σii = σi² > 0.
- σij = 0 (i≠j) if there is no cross-equation correlation.
- Ωij = I_N (the N×N identity) if there is no serial correlation.
The stacked (or pooled) model is written as

Y = Xβ + ε, or

| Y1 |   | X1  0   ..  0  | | β1 |   | ε1 |
| Y2 | = | 0   X2  ..  0  | | β2 | + | ε2 |
| :  |   | :   :   :   :  | | :  |   | :  |
| YG |   | 0   0   ..  XG | | βG |   | εG |

This model is also known as the Seemingly Unrelated Regression (SUR) model.
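As a concrete illustration, the block-diagonal stacking can be sketched in a few lines of numpy. The data here are simulated and the dimensions (G=3, N=50, K=2) are arbitrary choices; with no cross-equation correlation imposed, pooled OLS on the stacked system reproduces equation-by-equation OLS.

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(0)
G, N, K = 3, 50, 2  # equations, observations, regressors per equation

# Simulate G regressions with different coefficients (hypothetical data)
X_list = [np.column_stack([np.ones(N), rng.normal(size=N)]) for _ in range(G)]
beta_list = [np.array([1.0 + i, 0.5 * (i + 1)]) for i in range(G)]
Y_list = [X @ b + rng.normal(size=N) for X, b in zip(X_list, beta_list)]

# Stack into Y = X beta + eps with a block-diagonal regressor matrix
Y = np.concatenate(Y_list)                       # (G*N,) stacked dependent variable
X = block_diag(*X_list)                          # (G*N, G*K) block-diagonal design
beta_ols = np.linalg.lstsq(X, Y, rcond=None)[0]  # pooled OLS on the stacked system
```

Because X is block diagonal, the pooled OLS estimate decomposes into the G separate per-equation OLS estimates.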
Special Cases
- Common Parameters: βi = β for all i, that is,
  Yi = Xiβ + εi
- More restrictive cases include common regressors (Xi = X for all i) and restrictions on some or all of the parameters.
Consider the general model Y = Xβ + ε satisfying the following classical assumptions (no serial correlation, and homoscedasticity for each equation):

E(ε) = 0 (NG×1)
E(X'ε) = 0
Var(ε) = E(εε') = V = Σ⊗I_N (an NG×NG matrix), where

Σ = | σ11  σ12  ..  σ1G |
    | σ21  σ22  ..  σ2G |
    | :    :    :   :   |
    | σG1  σG2  ..  σGG |
Notice that contemporaneous correlation across equations is allowed even though there is no serial correlation within each equation. GLS (Generalized Least Squares) estimation of the model parameters β follows:
b = (X'V⁻¹X)⁻¹X'V⁻¹Y
Var(b) = (X'V⁻¹X)⁻¹
The elements σij of V = Σ⊗I_N are estimated by sij = ei'ej/N, where ei = Yi - Xib is the residual vector for equation i obtained from OLS estimation (that is, ignoring cross-equation correlation). The resulting feasible GLS estimation may be iterated, updating the residuals e and the variance-covariance matrix V at each step.
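The two-step (feasible) GLS procedure can be sketched with numpy; the data are simulated, the true error correlation (0.6) and all dimensions are made-up values for illustration only.

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(1)
G, N = 2, 200
# Errors with contemporaneous correlation across the two equations
Sigma_true = np.array([[1.0, 0.6], [0.6, 1.0]])
E = rng.multivariate_normal(np.zeros(G), Sigma_true, size=N)  # N x G errors
X_list = [np.column_stack([np.ones(N), rng.normal(size=N)]) for _ in range(G)]
Y_list = [X_list[i] @ np.array([1.0, 2.0]) + E[:, i] for i in range(G)]

Y = np.concatenate(Y_list)
X = block_diag(*X_list)

# Step 1: equation-by-equation OLS residuals
b_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
e = (Y - X @ b_ols).reshape(G, N)   # residuals, one row per equation
S = e @ e.T / N                     # s_ij = e_i'e_j / N

# Step 2: GLS with V = S kron I_N, so V^-1 = S^-1 kron I_N
Vinv = np.kron(np.linalg.inv(S), np.eye(N))
XtVinv = X.T @ Vinv
b_gls = np.linalg.solve(XtVinv @ X, XtVinv @ Y)
var_b_gls = np.linalg.inv(XtVinv @ X)
```

Iterating the two steps (recomputing residuals from b_gls and re-estimating S) gives the iterated FGLS estimator described above.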
Random Coefficients Model
Similar in spirit to the random effects model of panel data analysis, for each equation i=1,2,...,G the model (with K random coefficients) may be expressed as follows:

Yi = Xiβi + εi
βi = β + υi

Note that not only the intercept but also the slope parameters are random across equations. This model generalizes the system of regression equations with common parameters by letting those parameters be random.
The assumptions of the model are:

- E(εi) = 0 (N×1)
- Var(εi) = E(εiεi') = σi²I_N
- Cov(εi,εj) = 0 (i≠j)

and

- E(υi) = 0 (K×1)
- Var(υi) = E(υiυi') = Γ, a K×K matrix independent of i
- Cov(υi,υj) = 0 (i≠j)
- Cov(υi,εi) = 0
The model for estimation is

Yi = Xiβ + (Xiυi + εi), or
Yi = Xiβ + ωi

where ωi = Xiυi + εi, and

- E(ωi) = 0 (N×1)
- Var(ωi) = E(ωiωi') = Πi
  = E(Xiυiυi'Xi' + Xiυiεi' + εiυi'Xi' + εiεi')
  = σi²I_N + XiΓXi'
Then the stacked model is

Y = Xβ + ω

where ω = [ω1',...,ωG']', and

E(ω) = 0 (GN×1)
Var(ω) = E(ωω') = V =

| Π1  0   ..  0  |
| 0   Π2  ..  0  |
| :   :   :   :  |
| 0   0   ..  ΠG |
GLS is used to estimate the model. That is,

b* = (X'V⁻¹X)⁻¹X'V⁻¹Y
Var(b*) = (X'V⁻¹X)⁻¹
A special case of the random coefficients model assumes fixed slope coefficients with a random intercept:

Yij = Xijβ + ωij
ωij = υj + εij

where i = 1,2,...,G (equations) and j = 1,2,...,N (observations). For each observation j, υj is the random component of the intercept. We assume

- E(υj) = 0, Var(υj) = σu²
- E(εij) = 0, Var(εij) = σe²
- Cov(υj,εij) = 0
The stacked form of the model is written as

Y = Xβ + ω
ω = ι⊗υ + ε

where ι is the unit vector of size G (the number of equations) and υ = [υ1,...,υN]'. Then

Var(ω) = Var(ι⊗υ + ε) = ιι'⊗Var(υ) + Var(ε) = ιι'⊗(σu²I_N) + σe²I_GN = V

The model can be estimated with GLS.
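The structure of V is easy to verify numerically. A minimal sketch, with made-up variance components σu² = 0.5 and σe² = 1.0 and small dimensions chosen only for illustration:

```python
import numpy as np

G, N = 3, 4
sigma_u2, sigma_e2 = 0.5, 1.0   # hypothetical variance components

iota = np.ones((G, 1))
# V = (iota iota') kron (sigma_u^2 I_N) + sigma_e^2 I_{GN}
V = np.kron(iota @ iota.T, sigma_u2 * np.eye(N)) + sigma_e2 * np.eye(G * N)
```

Each diagonal element equals σu² + σe², and the same observation j shares the covariance σu² across equations, which is exactly the error-components pattern above.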
The computation of the random coefficients model is based on the following steps (Swamy, 1971):

- For each regression equation i, Yi = Xiβi + εi, obtain the OLS estimator of βi:

  bi = (Xi'Xi)⁻¹Xi'Yi
  Var(bi) = (Xi'Xi)⁻¹(Xi'ΠiXi)(Xi'Xi)⁻¹
          = σi²(Xi'Xi)⁻¹ + Γ
          = Vi + Γ

  (taking account of heteroscedasticity, where Vi = σi²(Xi'Xi)⁻¹). Note that σi² is estimated by si² = ei'ei/(N-K), where ei = Yi - Xibi. Then Vi is estimated by si²(Xi'Xi)⁻¹.
- For the random coefficients equation, βi = β + υi, the variance of bi (the estimator of βi) is estimated by

  ∑i=1,...,G (bi-bm)(bi-bm)'/(G-1) = [∑i=1,...,G bibi' - G bmbm']/(G-1)

  where bm = ∑i=1,...,G bi/G. Therefore,

  Γ = [∑i=1,...,G bibi' - G bmbm']/(G-1) - ∑i=1,...,G Vi/G

  Since this difference may fail to be positive definite, we use instead

  Γ = [∑i=1,...,G bibi' - G bmbm']/(G-1).
- Write the GLS estimator of β as:

  b* = (X'V⁻¹X)⁻¹X'V⁻¹Y
     = [∑i=1,...,G Xi'Πi⁻¹Xi]⁻¹ [∑i=1,...,G Xi'Πi⁻¹Yi]
     = [∑i=1,...,G Xi'Πi⁻¹Xi]⁻¹ [∑i=1,...,G Xi'Πi⁻¹Xibi]
     = [∑i=1,...,G (Γ+Vi)⁻¹]⁻¹ [∑i=1,...,G (Γ+Vi)⁻¹bi]
     = ∑i=1,...,G Wibi

  where Wi = [∑j=1,...,G (Γ+Vj)⁻¹]⁻¹ (Γ+Vi)⁻¹. Similarly,

  Var(b*) = (X'V⁻¹X)⁻¹ = [∑i=1,...,G (Γ+Vi)⁻¹]⁻¹
The individual parameter vectors may be predicted as follows:

bi* = (Γ⁻¹+Vi⁻¹)⁻¹[Γ⁻¹b* + Vi⁻¹bi]
    = Aib* + (I-Ai)bi

where Ai = (Γ⁻¹+Vi⁻¹)⁻¹Γ⁻¹.
Var(bi*) =

[Ai  I-Ai] | ∑j=1,...,G Wj(Γ+Vj)Wj'   Wi(Γ+Vi) | | Ai'     |
           | (Γ+Vi)Wi'                Γ+Vi     | | (I-Ai)' |
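Swamy's steps can be sketched directly in numpy. The data below are simulated (made-up β mean, Γ, and dimensions), and the Γ estimate uses the simpler positive semidefinite version from the notes, without subtracting the average of the Vi:

```python
import numpy as np

rng = np.random.default_rng(2)
G, N, K = 10, 100, 2
beta_mean = np.array([1.0, 2.0])          # hypothetical mean coefficient vector
Gamma_true = 0.25 * np.eye(K)             # hypothetical coefficient dispersion

b_list, V_list = [], []
for i in range(G):
    X = np.column_stack([np.ones(N), rng.normal(size=N)])
    beta_i = beta_mean + rng.multivariate_normal(np.zeros(K), Gamma_true)
    y = X @ beta_i + rng.normal(size=N)
    b = np.linalg.lstsq(X, y, rcond=None)[0]      # per-equation OLS
    e = y - X @ b
    s2 = e @ e / (N - K)                          # s_i^2 = e_i'e_i/(N-K)
    b_list.append(b)
    V_list.append(s2 * np.linalg.inv(X.T @ X))    # V_i = s_i^2 (X_i'X_i)^-1

B = np.array(b_list)                              # G x K matrix of OLS estimates
bm = B.mean(axis=0)
Gamma = (B - bm).T @ (B - bm) / (G - 1)           # PSD estimate of Gamma

# Swamy GLS: b* = [sum (Gamma+V_i)^-1]^-1 sum (Gamma+V_i)^-1 b_i
Ainvs = [np.linalg.inv(Gamma + Vi) for Vi in V_list]
b_star = np.linalg.solve(sum(Ainvs), sum(A @ b for A, b in zip(Ainvs, b_list)))
```

The final line is the matrix-weighted average ∑Wibi of the per-equation OLS estimates, with weights inversely proportional to Γ+Vi.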
Vector Autoregression
Generalizing from the univariate time series AR(1) model:

Yt = μ + ρYt-1 + εt

the multivariate system of G variables can be written as follows:

Yit = μi + ∑j=1,2,...,G ρijYj,t-1 + εit   (i=1,2,...,G)

This is called a Vector Autoregression of order 1, or VAR(1).
The matrix representation of the model as a simultaneous linear equations system looks like this:

[Y1t, Y2t, ..., YGt] = [μ1, μ2, ..., μG] + [Y1,t-1, Y2,t-1, ..., YG,t-1] ρ' + [ε1t, ε2t, ..., εGt]

where the coefficient matrix (the transpose of ρ in the stacked form below) is

ρ' = | ρ11  ρ21  ..  ρG1 |
     | ρ12  ρ22  ..  ρG2 |
     | :    :    :   :   |
     | ρ1G  ρ2G  ..  ρGG |
The alternative is the stacked form suitable for estimation as a system of regression equations:

| Y1t |   | μ1 |   | ρ11  ρ12  ..  ρ1G | | Y1,t-1 |   | ε1t |
| Y2t | = | μ2 | + | ρ21  ρ22  ..  ρ2G | | Y2,t-1 | + | ε2t |
| :   |   | :  |   | :    :    :   :   | | :      |   | :   |
| YGt |   | μG |   | ρG1  ρG2  ..  ρGG | | YG,t-1 |   | εGt |
In shorthand notation,

Yt = μ + ρYt-1 + εt
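Since every equation of a VAR shares the same regressors (a constant and all lagged variables), system GLS reduces to equation-by-equation OLS, so the model can be estimated by a single least-squares fit. A minimal sketch on simulated data (the μ and ρ values are made up):

```python
import numpy as np

rng = np.random.default_rng(3)
G, N = 2, 500
mu = np.array([0.5, -0.2])
rho = np.array([[0.5, 0.1], [0.2, 0.4]])   # stable: eigenvalues inside unit circle

# Simulate Y_t = mu + rho Y_{t-1} + eps_t
Y = np.zeros((N, G))
for t in range(1, N):
    Y[t] = mu + rho @ Y[t - 1] + rng.normal(size=G)

# OLS of Y_t on [1, Y_{t-1}], all equations at once
Z = np.column_stack([np.ones(N - 1), Y[:-1]])    # (N-1) x (G+1) regressor matrix
coef = np.linalg.lstsq(Z, Y[1:], rcond=None)[0]  # (G+1) x G: [mu'; rho']
mu_hat, rho_hat = coef[0], coef[1:].T
```

The first row of the coefficient matrix recovers the intercepts and the remaining rows (transposed) recover ρ.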
Extension: VAR(p)
First, we can write the univariate AR(p) model as the system:

Yt = μ + ρ1Yt-1 + ρ2Yt-2 + ... + ρpYt-p + εt
Yt-1 = Yt-1
Yt-2 = Yt-2
:
Yt-p+1 = Yt-p+1
Or,

| Yt     |   | μ |   | ρ1  ρ2  ..  ρp | | Yt-1 |   | εt |
| Yt-1   | = | 0 | + | 1   0   ..  0  | | Yt-2 | + | 0  |
| :      |   | : |   | :   :   :   :  | | :    |   | :  |
| Yt-p+1 |   | 0 |   | 0   ..  1   0  | | Yt-p |   | 0  |
That is,

Yt = μ + ρYt-1 + εt

This is a system of p equations with a restricted parameter matrix. The usable time series observations run from p+1 to N (N-p in total).

Similarly, for the multivariate VAR(p) system, the model can be expressed in terms of the stacked G endogenous variables, where Yt, Yt-1, ..., and Yt-p are G×1 vectors. The size of the problem is (N-p)×Gp.
Then the parameter matrix ρ of the lag variable Yt-1 is

ρ = | ρ1  ρ2  ..  ..  ρp |
    | I   0   ..  ..  0  |
    | 0   I   :   :   0  |
    | 0   0   ..  I   0  |

where, for each k = 1,2,...,p, ρk = [ρij,k] (i,j=1,2,...,G). Furthermore, I is the G×G identity matrix and 0 is the G×G zero matrix.
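Building this companion matrix is mechanical; a small helper sketch (the example lag matrices rho1, rho2 are made-up values):

```python
import numpy as np

def companion(rho_list):
    """Stack the G x G lag matrices rho_1,...,rho_p into the Gp x Gp companion form."""
    G = rho_list[0].shape[0]
    p = len(rho_list)
    top = np.hstack(rho_list)            # first block row: [rho_1 rho_2 .. rho_p]
    bottom = np.eye(G * (p - 1), G * p)  # shifted identity blocks: [I 0 ..; 0 I ..]
    return np.vstack([top, bottom])

rho1 = np.array([[0.5, 0.1], [0.0, 0.3]])
rho2 = np.array([[0.2, 0.0], [0.1, 0.1]])
C = companion([rho1, rho2])
# The VAR(p) is stable when all eigenvalues of C lie inside the unit circle
stable = np.all(np.abs(np.linalg.eigvals(C)) < 1)
```

The rectangular identity `np.eye(G*(p-1), G*p)` places the I blocks one block-column to the left of the diagonal, matching the matrix above.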
Impulse Response Functions
Deriving from a general VAR(1) system, Yt = μ + ρYt-1 + εt, we write:

(I - ρB)Yt = μ + εt

where B is the backshift operator. Then,

Yt = (I - ρ)⁻¹μ + ∑i=0,1,2,...,∞ ρ^i εt-i
   = Y* + (εt + ρεt-1 + ρ²εt-2 + ...)
Y* is the equilibrium and εt is the innovation. By shocking one element of εt, say εjt, Yt will move away from the equilibrium Y*. Note that the effect of a change in εjt is not confined to the jth variable alone but spreads to the other variables in the system. The path along which the variables return to equilibrium is called the impulse response of a stable VAR system. The impulse response function traces the effect of a one-time innovation εjt on the k-th variable over time (i=0,1,2,...) as the (k,j) element of ρ^i (k,j = 1,2,...,G).
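The impulse responses of a VAR(1) are therefore just the successive powers of ρ. A minimal sketch, using a made-up stable coefficient matrix:

```python
import numpy as np

rho = np.array([[0.5, 0.1], [0.2, 0.4]])   # hypothetical stable VAR(1) matrix

# Response of variable k at horizon i to a unit shock in eps_j is [rho^i]_{k,j}
horizons = 10
irf = np.stack([np.linalg.matrix_power(rho, i) for i in range(horizons)])

# e.g. the response path of variable 0 to a unit innovation in variable 1:
path_01 = irf[:, 0, 1]
```

At horizon 0 the response is the identity (the shock itself); for a stable ρ the responses decay toward zero, which is the return to the equilibrium Y*.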
Copyright © Kuan-Pin Lin
Last updated: 1/24/2012