N | Number of observations (i=1,2,...,N) |
G | Number of equations (endogenous variables) (j=1,2,...,G) |
K | Number of predetermined variables (k=1,2,...,K) |
Gj | Number of RHS endogenous variables in the equation j; Gj+1 is the number of endogenous variables in the equation j |
Kj | Number of predetermined variables in the equation j |
Gj* | Number of endogenous variables not in equation j; Gj + Gj* + 1 = G |
Kj* | Number of predetermined variables not in equation j; Kj + Kj* = K |
Y | NxG Data matrix of endogenous variables |
X | NxK Data matrix of predetermined variables |
Z | Z=Y~X, Nx(G+K) Data matrix of all variables |
B | GxG parameter (sparse) matrix associated with Y; note Bjj = -1 (normalization) |
G | KxG parameter (sparse) matrix associated with X (the structural coefficient matrix, not to be confused with G, the number of equations) |
D | D=B|G, (G+K)xG parameter (sparse) matrix associated with Z (B stacked on top of G) |
U, V | NxG error matrices |
YB + XG = U (or ZD = U) |
Yi.B + Xi.G = Ui. (i=1,2,...,N) |
YB.j + XG.j = U.j (j=1,2,...,G)
The equation specification may be expressed as |
yj = Yjbj + Xjgj + ej |
Note: ej = 0 if the j-th equation is an identity. |
Y = XP + V |
Yi. = Xi.P+Vi. (i=1,2,...,N) |
Y.j = XP.j+V.j (j=1,2,...,G) |
P.j = (X'X)-1X'yj
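The column-by-column least squares formula above can be computed for all G equations at once. A minimal numpy sketch with simulated data (the dimensions and coefficient values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
N, K, G = 200, 3, 2                        # hypothetical sizes
X = rng.normal(size=(N, K))                # NxK predetermined variables
P_true = np.array([[1.0, -0.5],
                   [0.3,  0.8],
                   [-0.2, 0.4]])           # KxG reduced-form coefficients
Y = X @ P_true + 0.1 * rng.normal(size=(N, G))

# P.j = (X'X)^-1 X'y_j for each equation j; solving with all G columns of Y
# at once returns the whole KxG matrix P in one call
P_hat = np.linalg.solve(X.T @ X, X.T @ Y)
```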
Given the estimator of P.j for each equation j, can we derive or solve for the corresponding structural parameters B.j and G.j through the non-linear relationship P = -GB-1?
The j-th stochastic equation is identified if the structural parameters B.j and G.j are derivable from the reduced form parameters in P. An identity equation is automatically identified. A linear system model is identified if all the stochastic equations are identified.
From P = -GB-1, that is PB.j = -G.j, the restrictions on equation j can be written as:

[P I] | B.j | = 0
      | G.j |

Where | P | (KxG parameter matrix) |
| I | (KxK identity matrix) |
| B.j | (Gx1 parameter vector) |
| G.j | (Kx1 parameter vector) |
The order condition for identification is K ≥ Gj+Kj, or Kj* ≥ Gj.
Equivalently, Kj* + Gj* ≥ G-1, since G = Gj + Gj* + 1.
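Since the counts above are all that the order condition needs, a small helper (the function name and return labels are my own) can classify an equation:

```python
def order_condition(K, G_j, K_j):
    """Order condition for equation j: the number of predetermined
    variables excluded from the equation (K_j* = K - K_j) must be at
    least the number of RHS endogenous variables (G_j)."""
    K_j_star = K - K_j
    if K_j_star > G_j:
        return "over-identified"
    if K_j_star == G_j:
        return "exactly identified"
    return "under-identified"
```

The order condition is only necessary; the rank condition discussed below must also hold.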
Partition P conformably with equation j (rows split into the Kj included and Kj* excluded predetermined variables; columns split into yj, the included Yj, and the excluded endogenous variables). Then PB.j = -G.j becomes:

| P1  P2  P3  | | -1 |     | gj |
| P1* P2* P3* | | bj | = - | 0  |
                | 0  |

The first Kj rows give gj = P1 - P2bj ... (1), and the remaining Kj* rows give P2*bj = P1* ... (2).
Remember that yj = Yjbj + Xjgj + ej.
Where | P1 | (Kjx1 vector) |
P2 | (KjxGj matrix) | |
P3 | (KjxGj* matrix) | |
P1* | (Kj*x1 vector) | |
P2* | (Kj*xGj matrix) | |
P3* | (Kj*xGj* matrix) | |
bj can be solved uniquely from (2) if and only if the rank condition holds: rank([P1* P2*]) = rank(P2*) = Gj.
Once bj is solved, gj is obtained from (1).
In practice, the rank condition as derived is difficult to check because the submatrix P2* is not known prior to estimation. The alternative is to check the structural parameters in B and G against the zero restrictions for each equation. That is, for each equation j, there must exist a matrix of rank G-1 formed from the coefficients of the variables that appear in the other equations but are excluded from equation j.
yj = Zjdj + ej
where Zj = [Yj Xj] and dj = [bj' gj']'. In what follows, dj also denotes the estimator of dj.
Note: the OLS estimator of dj is biased and inconsistent due to the random regressors problem (in general there are RHS endogenous variables in the equation). The method of instrumental variables is recommended instead. The appropriate instrumental variables for the RHS endogenous variables can be constructed from the least squares estimator of P.j (that is, (X'X)-1X'yj) for the reduced form equation yj = XP.j + V.j.
yj = Wjdj + ej
where Wj = [X(X'X)-1X'Yj Xj]. Recall that Zj = [Yj Xj] and thus W'jZj = W'jWj. Then the 2SLS estimator of dj is the following:
dj = (W'jZj)-1W'jyj
= (Z'j[X(X'X)-1X']Zj)-1Z'j[X(X'X)-1X']yj
Var(dj) = s2j(W'jZj)-1
= s2j(Z'j[X(X'X)-1X']Zj)-1
s2j = e'jej/N, and
ej = yj - Zjdj
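As a worked sketch of the formulas above, the following simulates a two-equation system (the coefficients and error covariance are arbitrary choices, not from the text) and applies 2SLS to the first equation:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500

# Simulate YB + XG = U for a hypothetical two-equation system with the
# B[j,j] = -1 normalization; equation 1 is y1 = 0.6*y2 + 0.8*x1 + error.
X = rng.normal(size=(N, 3))
U = rng.normal(size=(N, 2)) @ np.array([[1.0, 0.5], [0.0, 1.0]])  # correlated errors
B = np.array([[-1.0, 0.4], [0.6, -1.0]])
G = np.array([[0.8, 0.0], [0.0, 0.7], [0.0, 0.5]])
Y = (U - X @ G) @ np.linalg.inv(B)          # reduced form: Y = (U - XG)B^-1

# 2SLS for equation 1: Zj = [Yj Xj], Wj = [X(X'X)^-1X'Yj  Xj]
y1 = Y[:, 0]
Z1 = np.column_stack([Y[:, 1], X[:, 0]])
Px = X @ np.linalg.solve(X.T @ X, X.T)      # projection onto all predetermined vars
W1 = Px @ Z1                                # Px leaves the X[:, 0] column unchanged
d1 = np.linalg.solve(W1.T @ Z1, W1.T @ y1)  # dj = (W'jZj)^-1 W'jyj
```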
Note: the 2SLS estimator of dj does not take cross-equation correlation into account, although the instrumental variables are obtained from all the predetermined variables in the model.
By assuming a normal distribution for the reduced form error matrix V0j, with zero mean and variance-covariance matrix Ω0j, the LIML estimator of dj is obtained by maximizing the log-likelihood function:
L(P0j) = -½ {N(Gj+1)log(2π) + N log(|Ω0j|) + trace[Ω0j-1(Y0j - XP0j)'(Y0j - XP0j)]}
subject to the identification constraint:
PB.j = -G.j
The LIML estimator is identical to the least variance ratio estimator, which is a special case of the k-class estimator.
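To illustrate the least variance ratio idea, the sketch below (simulated data with arbitrary coefficients) computes the smallest variance ratio λ for equation 1 and plugs it into the k-class formula as k = λ; setting k = 1 instead would reproduce 2SLS:

```python
import numpy as np

rng = np.random.default_rng(7)
N = 500
X = rng.normal(size=(N, 3))
U = rng.normal(size=(N, 2)) @ np.array([[1.0, 0.5], [0.0, 1.0]])
B = np.array([[-1.0, 0.4], [0.6, -1.0]])
G = np.array([[0.8, 0.0], [0.0, 0.7], [0.0, 0.5]])
Y = (U - X @ G) @ np.linalg.inv(B)    # equation 1: y1 = 0.6*y2 + 0.8*x1 + error

def annihilator(W):
    """Return A -> M_W A, the residual maker M_W = I - W(W'W)^-1 W'."""
    return lambda A: A - W @ np.linalg.solve(W.T @ W, W.T @ A)

A = Y[:, :2]                          # [y1 Y1], endogenous variables of equation 1
M1, M = annihilator(X[:, [0]]), annihilator(X)
W1, W2 = A.T @ M1(A), A.T @ M(A)      # residual moments using Xj only vs. all of X
lam = np.linalg.eigvals(np.linalg.solve(W2, W1)).real.min()  # least variance ratio

# LIML as a k-class estimator with k = lam; M(Z1) is zero on the X columns,
# so only the endogenous regressor is adjusted.
Z1 = np.column_stack([Y[:, 1], X[:, 0]])
y1 = Y[:, 0]
Zk = Z1 - lam * M(Z1)                 # (I - k*M_X) Z1
d_liml = np.linalg.solve(Z1.T @ Zk, Zk.T @ y1)
```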
S = e'e/N, where e = [e1, e2, ..., eG] collects the single-equation residuals.
Var(d) = [smn(W'mZm)-1(W'mZn)(W'nZn)-1, m,n=1,2,...,G]
Where smn is the (m,n)-th element of S, with smm = s2m and snn = s2n.
By stacking all the stochastic equations yj = Wjdj + ej (j=1,2,...,G) as follows:
| y1 |   | W1 0  ... 0  | | d1 |   | e1 |
| y2 |   | 0  W2 ... 0  | | d2 |   | e2 |
| .. | = | .. .. ... .. | | .. | + | .. |
| yG |   | 0  0  ... WG | | dG |   | eG |
Write the above stacked-equation system as: y = Wd + e, where
y | NGx1 data vector |
W | NGx(Σj=1,2,...,G (Gj+Kj)) block-diagonal data matrix |
d | (Σj=1,2,...,G (Gj+Kj))x1 parameter vector |
e | NGx1 error vector |
The error structure e satisfies:
E(e) = 0 and
Var(e) = E(ee')
= S ⊗ I
= [sijI (i,j=1,2,...,G)]
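The S ⊗ I block structure can be checked numerically with np.kron (the covariance values below are illustrative):

```python
import numpy as np

S = np.array([[2.0, 0.5],
              [0.5, 1.0]])      # hypothetical cross-equation covariance (G=2)
V = np.kron(S, np.eye(3))       # Var(e) for N=3: block (i,j) equals s_ij * I
```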
e is clearly heteroscedastic and correlated across equations. Denote d as the Generalized Least Squares (GLS) estimator of d. Then
d = [W'(S-1⊗I)Z]-1W'(S-1⊗I)y
= | s11W'1Z1 s12W'1Z2 ... s1GW'1ZG |-1 | Σk s1kW'1yk |
  | s21W'2Z1 s22W'2Z2 ... s2GW'2ZG |   | Σk s2kW'2yk |
  | ...                            |   | ...         |
  | sG1W'GZ1 sG2W'GZ2 ... sGGW'GZG |   | Σk sGkW'Gyk |

where sjk is the (j,k)-th element of S-1 and each sum runs over k=1,2,...,G.
Var(d) = [W'(S-1⊗I)Z]-1
S = e'e/N is the estimated variance-covariance matrix, where e = [e1, e2, ..., eG] and the estimated residuals are ej = yj - Zjdj (j=1,2,...,G). Furthermore, S-1 denotes the inverse of S with elements sjk (j,k=1,2,...,G).
Note: Since S-1 depends on d, iterations of 3SLS may be performed until convergence.
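A compact sketch of the two 3SLS steps (equation-by-equation 2SLS residuals to estimate S, then GLS on the stacked system) for a simulated two-equation model; the system and all numbers are my own illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 400

# Simulated system: y1 = 0.6*y2 + 0.8*x1 + err1,  y2 = 0.4*y1 + 0.7*x2 + 0.5*x3 + err2
X = rng.normal(size=(N, 3))
U = rng.normal(size=(N, 2)) @ np.array([[1.0, 0.6], [0.0, 1.0]])
B = np.array([[-1.0, 0.4], [0.6, -1.0]])
G = np.array([[0.8, 0.0], [0.0, 0.7], [0.0, 0.5]])
Y = (U - X @ G) @ np.linalg.inv(B)

ys = [Y[:, 0], Y[:, 1]]
Zs = [np.column_stack([Y[:, 1], X[:, 0]]),
      np.column_stack([Y[:, 0], X[:, 1], X[:, 2]])]
Px = X @ np.linalg.solve(X.T @ X, X.T)
Ws = [Px @ Z for Z in Zs]                     # instruments Wj = Px Zj

# Step 1: 2SLS residuals give the estimated cross-equation covariance S
d2 = [np.linalg.solve(W.T @ Z, W.T @ y) for y, Z, W in zip(ys, Zs, Ws)]
E = np.column_stack([y - Z @ d for y, Z, d in zip(ys, Zs, d2)])
Sinv = np.linalg.inv(E.T @ E / N)

# Step 2: GLS on the stacked system, d = [W'(S^-1 (x) I)Z]^-1 W'(S^-1 (x) I)y,
# built block by block: block (j,k) is sjk * W'jZk
Geq = len(ys)
A = np.block([[Sinv[j, k] * Ws[j].T @ Zs[k] for k in range(Geq)] for j in range(Geq)])
b = np.concatenate([sum(Sinv[j, k] * Ws[j].T @ ys[k] for k in range(Geq))
                    for j in range(Geq)])
d3 = np.linalg.solve(A, b)                    # stacked (d1', d2')'
```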
L*(B,G) = -½ NG(1 + log(2π)) + N log(|B|) - ½ N log(|(YB+XG)'(YB+XG)|/N)
Since |B'Y'YB| = |B|2|Y'Y|, so that log(|B|) = ½ log(|B'Y'YB|) - ½ log(|Y'Y|), we can also write
L*(B,G)
= -½ NG(1 + log(2π)) - ½ N log(|Y'Y|)
+ ½ N log(|B'Y'YB|/N) - ½ N log(|(YB+XG)'(YB+XG)|/N)
Instrumental Variables Method
The FIML estimator using the IV method is obtained by maximizing
L*1(B,G)
= N log(|B|) - ½ N log(|(YB+XG)'(YB+XG)/N|)
The first derivatives of L*1(B,G) are used to set up normal equations similar to the iterative 3SLS estimation. Let S = (YB+XG)'(YB+XG)/N; the normal equations for maximizing L*1(B,G) are:
∂L*1/∂B = NB'-1 - Y'(YB+XG)S-1 = 0
∂L*1/∂G = -X'(YB+XG)S-1 = 0
By substituting out N, combining terms, and using the parameter restriction P = -GB-1 in the first equation, it can be re-written as follows:
∂L*1/∂B = -P'X'(YB+XG)S-1 = 0
Together with the second equation, the normal equations in matrix form are:
[XGB-1 X]'(YB+XG)S-1 = 0
We need to re-arrange the equations and parameters, and define Wj* = [(-XGB-1)j   Xj] and write the typical j-th equation: yj = Wj*dj + ej (j=1,2,...,G). The corresponding stacked-equations system y = W*d + e:
| y1 |   | W1* 0   ... 0   | | d1 |   | e1 |
| y2 |   | 0   W2* ... 0   | | d2 |   | e2 |
| .. | = | ..  ..  ... ..  | | .. | + | .. |
| yG |   | 0   0   ... WG* | | dG |   | eG |
As in the 3SLS, the FIML estimator for d is:
d = [W*'(S-1⊗I)Z]-1W*'(S-1⊗I)y
= | s11W1*'Z1 s12W1*'Z2 ... s1GW1*'ZG |-1 | Σk s1kW1*'yk |
  | s21W2*'Z1 s22W2*'Z2 ... s2GW2*'ZG |   | Σk s2kW2*'yk |
  | ...                               |   | ...          |
  | sG1WG*'Z1 sG2WG*'Z2 ... sGGWG*'ZG |   | Σk sGkWG*'yk |

where sjk is the (j,k)-th element of S-1 and each sum runs over k=1,2,...,G.
Var(d) = [W*'(S-1⊗I)Z]-1
S = e'e/N and e = [e1, e2, ..., eG] with ej = yj - Zjdj (j=1,2,...,G).
Linearized ML Method
The FIML estimator using the linearized ML method is obtained by maximizing
L*2(B,G)
= log(|B'Y'YB|/N) - log(|(YB+XG)'(YB+XG)|/N)
Let Q = B'Y'YB/N and S = (YB+XG)'(YB+XG)/N; then the normal equations for maximizing L*2(B,G) are:
∂L*2/∂B = Y'YBQ-1 - Y'(YB+XG)S-1 = 0
∂L*2/∂G = -X'(YB+XG)S-1 = 0
Re-arranging the equations and parameters, let Zj = [Yj Xj] and Z0j = [Yj 0].
           | Z1 0  ... 0  |           | Z01 0   ... 0   |
Define Z = | 0  Z2 ... 0  | and Z0 =  | 0   Z02 ... 0   |
           | .. .. ... .. |           | ..  ..  ... ..  |
           | 0  0  ... ZG |           | 0   0   ... Z0G |
Then the FIML estimator of d is derived from the following
d = [Z'(S-1⊗I)Z - Z'0(Q-1⊗I)Z0]-1 [Z'(S-1⊗I)y - Z'0(Q-1⊗I)y]
Where S = e'e/N, e = [e1, e2, ..., eG], and ej = yj - Zjdj (j=1,2,...,G). Similarly, Q = e0'e0/N, e0 = [e01, e02, ..., e0G], and e0j = yj - Z0jdj (j=1,2,...,G).
Newton Method
Both the first derivatives (gradient) and second derivatives (Hessian) of L*2(B,G) are used in the iterative estimation.
Y = Y-1P1 + XP2 + V
From now on, X denotes the data matrix of current and lagged exogenous variables and Y-1 includes lagged endogenous variables. Then,
P = | P1 |
    | P2 |
The stability of the model requires that the characteristic roots of P1 lie inside the unit circle. A plot of the period (dynamic) multipliers against the lag length is called the Impulse Response Function.
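Checking stability is an eigenvalue computation on P1; the matrix below is a made-up example:

```python
import numpy as np

# Hypothetical lag-coefficient matrix P1 from Y = Y_{-1} P1 + X P2 + V
P1 = np.array([[0.5, 0.2],
               [0.1, 0.6]])

roots = np.linalg.eigvals(P1)            # characteristic roots of P1
stable = bool(np.all(np.abs(roots) < 1.0))
```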