Simultaneous Linear Equations System

Table of Contents

Introduction

The Model

Identification

Estimation

Example: Klein's Model I


Introduction

Variables

Equations


The Model

Notations

N Number of observations (i=1,2,...,N)
G Number of equations (endogenous variables) (j=1,2,...,G)
K Number of predetermined variables (k=1,2,...,K)
Gj Number of RHS endogenous variables in the equation j;
Gj+1 is the number of endogenous variables in the equation j
Kj Number of predetermined variables in the equation j
Gj* Number of endogenous variables not in the equation j
Gj+Gj*+1 = G
Kj* Number of predetermined variables not in the equation j
Kj+Kj* = K
Y NxG Data matrix of endogenous variables
X NxK Data matrix of predetermined variables
Z Z=[Y X], Nx(G+K) Data matrix of all variables
B GxG parameter (sparse) matrix associated with Y
Note: Bjj = -1 (normalization)
Γ KxG parameter (sparse) matrix associated with X
Δ Δ=[B' Γ']', (G+K)xG parameter (sparse) matrix associated with Z
U, V NxG error matrices

Model Representations

Model Assumptions


Identification

Consider the j-th stochastic equation of a linear system model. Its reduced form representation yj = Y.j = XΠ.j + V.j can be estimated consistently using ordinary least squares. That is,

Π.j = (X'X)-1X'yj

Given the parameter estimator of Π.j for each equation j, can we derive or solve the corresponding structural parameters B.j and Γ.j through the non-linear relationship Π = -ΓB-1?

The j-th stochastic equation is identified if the structural parameters B.j and Γ.j are derivable from the reduced form parameters in Π. An identity equation is automatically identified. A linear system model is identified if all the stochastic equations are identified.
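
Each column of Π can be estimated by OLS on the same regressor matrix X. As a minimal illustration, here is a numpy sketch with synthetic placeholder data (all names and dimensions are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
N, G, K = 100, 2, 3          # observations, endogenous, predetermined
X = rng.normal(size=(N, K))  # data matrix of predetermined variables
Y = rng.normal(size=(N, G))  # data matrix of endogenous variables

# Reduced form OLS: Pi.j = (X'X)^(-1) X'y_j, computed for all j at once
Pi_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(Pi_hat.shape)          # (K, G)
```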

Order Condition

From the relationship between the structural and reduced form parameters for the j-th equation, ΠB.j = -Γ.j, or ΠB.j + Γ.j = 0:

\[
\begin{bmatrix} \Pi & I \end{bmatrix}
\begin{bmatrix} B_{.j} \\ \Gamma_{.j} \end{bmatrix} = 0
\]

Where Π (KxG parameter matrix)
I (KxK identity matrix)
B.j (Gx1 parameter vector)
Γ.j (Kx1 parameter vector)
Since one element of B.j is -1 (normalization) and many elements of B.j and Γ.j are 0 (zero restrictions), there are Gj+Kj unknowns to be solved from the K rows of [Π I]. In other words, there must be at least Gj+Kj equations to find a solution for the unknown elements of B.j and Γ.j. That is,

K ≥ Gj+Kj or Kj* ≥ Gj.

Equivalently, Kj* + Gj* ≥ G-1, since G = Gj + Gj* + 1.
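
Because the order condition is only a count of excluded variables, it can be checked before any estimation. A small sketch of the test Kj* ≥ Gj (the per-equation counts are hypothetical):

```python
# Order condition: the number of excluded predetermined variables
# K_j* = K - K_j must be at least the number of RHS endogenous
# variables G_j in each stochastic equation.
K = 8                               # predetermined variables in the system
equations = {"equation 1": (2, 3),  # hypothetical (G_j, K_j) per equation
             "equation 2": (1, 4),
             "equation 3": (1, 5)}

for name, (Gj, Kj) in equations.items():
    Kj_star = K - Kj
    verdict = "order condition holds" if Kj_star >= Gj else "under-identified"
    print(f"{name}: K_j* = {Kj_star}, G_j = {Gj} -> {verdict}")
```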

Rank Condition

In more detail, for the j-th equation, the parameter relationship ΠB.j = -Γ.j can be re-arranged as follows:

\[
\begin{bmatrix}
\Pi_1 & \Pi_2 & \Pi_3 \\
\Pi_1^* & \Pi_2^* & \Pi_3^*
\end{bmatrix}
\begin{bmatrix} -1 \\ \beta_j \\ 0 \end{bmatrix}
= -
\begin{bmatrix} \gamma_j \\ 0 \end{bmatrix}
\]

Remember that yj = Yjβj + Xjγj + εj.
Where Π1 (Kjx1 vector)
Π2 (KjxGj matrix)
Π3 (KjxGj* matrix)
Π1* (Kj*x1 vector)
Π2* (Kj*xGj matrix)
Π3* (Kj*xGj* matrix)
Solving for βj and γj from the reduced form parameters in Π can be accomplished by solving the following two sets of equations:

  1. Π1 - Π2βj = γj
    Gj+Kj unknowns (βj and γj) in Kj equations.
  2. Π1* - Π2*βj = 0
    Gj unknowns (βj) in Kj* equations.
From (2), Π2*βj = Π1*. Solving for βj by least squares, βj = (Π2*'Π2*)-1Π2*'Π1*, which requires the full rank condition. That is,

rank([Π1* Π2*]) = rank(Π2*) = Gj.

Once βj is solved, γj is obtained from (1).

In practice, the rank condition as derived is difficult to check because the dense matrix Π2* is not known prior to estimation. The alternative is to check the structural parameters in B and Γ against the zero restrictions for each equation. That is, for each equation j, the matrix formed from the coefficients that the other equations place on the variables excluded from the j-th equation must have rank G-1.
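
A sketch of this structural-form check, assuming the stacked coefficient matrix Δ = [B' Γ']' is available with zeros on the excluded variables (the example Δ is made up):

```python
import numpy as np

# Delta = [B; Gamma], a (G+K)xG matrix; zeros mark excluded variables.
# Hypothetical 2-equation system (G=2, K=2).
Delta = np.array([[-1.0,  0.5],
                  [ 0.8, -1.0],
                  [ 0.3,  0.0],   # x1 excluded from equation 2
                  [ 0.0,  0.6]])  # x2 excluded from equation 1
G = 2

for j in range(G):
    excluded = np.isclose(Delta[:, j], 0.0)  # variables not in equation j
    others = np.delete(np.arange(G), j)      # columns of the other equations
    sub = Delta[np.ix_(excluded, others)]    # other equations' coefficients
    ok = np.linalg.matrix_rank(sub) == G - 1
    print(f"equation {j+1}: rank condition {'holds' if ok else 'fails'}")
```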


Estimation

Limited Information Estimation

Consider the structural equation j, YB.j + XΓ.j = U.j, or yj = Yjβj + Xjγj + εj. Let's write the j-th equation for estimation as follows:

yj = Zjδj + εj

where Zj = [Yj Xj] and δj = [βj γj]. Denote dj as the estimator of δj.

  1. Ordinary Least Squares
    dj = (Z'jZj)-1Z'jyj
    Var(dj) = s2j(Z'jZj)-1
    s2j = e'jej/N is the estimate of σ2j;
    ej = yj - Zjdj is the estimated residuals.

    Note: the OLS estimator of δj (that is, dj) is biased and inconsistent due to the random regressors problem (in general there are RHS endogenous variables in the equation). The method of instrumental variables is recommended instead. The appropriate instrumental variables for the RHS endogenous variables can be constructed from the least squares estimator of Π.j (that is, (X'X)-1X'yj) for the reduced form equation yj = Y.j = XΠ.j + V.j.

  2. Two Stage Least Squares
    For the j-th equation, substitute the RHS endogenous variables Yj with the instrumental variables X(X'X)-1X'Yj (that is, the fitted values of Yj), and write the j-th equation for estimation as:

    yj = Wjδj + εj

    where Wj = [X(X'X)-1X'Yj   Xj]. Recall that Zj = [Yj Xj]; since the projection X(X'X)-1X' is idempotent and reproduces Xj, we have Wj = X(X'X)-1X'Zj and thus W'jZj = W'jWj. Then the 2SLS estimator of δj is the following:

    dj = (W'jZj)-1W'jyj = (Z'j[X(X'X)-1X']Zj)-1Z'j[X(X'X)-1X']yj
    Var(dj) = s2j(W'jZj)-1 = s2j(Z'j[X(X'X)-1X']Zj)-1
    s2j = e'jej/N, and
    ej = yj - Zjdj

    Note: the 2SLS estimator of δj (that is, dj) does not take cross-equation correlation into account, although the instrumental variables are obtained from all the predetermined variables in the model. A numerical sketch of 2SLS follows this list.

  3. Limited Information Maximum Likelihood
    Consider the reduced form relevant to the j-th structural equation: Y0j = XΠ0j + V0j, where Y0j = [yj Yj] and Π0j is the corresponding reduced form parameter matrix.

    By assuming a normal distribution for the reduced form error matrix V0j with zero mean and variance-covariance matrix Ω0j, the LIML estimator of δj is obtained by maximizing the log-likelihood function:

    L(Π0j) = -½ N ((Gj+1)log(2π) + log(|Ω0j|)) - ½ trace[Ω0j-1(Y0j - XΠ0j)'(Y0j - XΠ0j)]

    subject to the identification constraint:

    ΠB.j = -Γ.j

    The LIML estimator is the same as the least variance ratio estimator, which is a special case of the k-class estimator.
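
To make the limited information estimators concrete, here is a minimal single-equation 2SLS sketch following item 2 (the data generating process and the split into Yj and Xj are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 200, 4
X = rng.normal(size=(N, K))   # all predetermined variables (instruments)
Xj = X[:, :2]                 # predetermined variables included in eq. j
Yj = X @ rng.normal(size=(K, 1)) + rng.normal(size=(N, 1))  # RHS endogenous
yj = (Yj @ np.array([[0.5]]) + Xj @ np.array([[1.0], [-0.3]])
      + rng.normal(size=(N, 1)))

# Stage 1: fitted values of the RHS endogenous variables
P = X @ np.linalg.solve(X.T @ X, X.T)       # projection X(X'X)^(-1)X'
Wj = np.hstack([P @ Yj, Xj])                # [fitted Y_j, X_j]
Zj = np.hstack([Yj, Xj])                    # original regressors

# Stage 2: d_j = (W_j'Z_j)^(-1) W_j'y_j
dj = np.linalg.solve(Wj.T @ Zj, Wj.T @ yj)
ej = yj - Zj @ dj                           # residuals use the original Y_j
s2 = float(ej.T @ ej) / N
var_dj = s2 * np.linalg.inv(Wj.T @ Zj)      # Var(d_j) = s_j^2 (W_j'Z_j)^(-1)
print(dj.ravel(), np.sqrt(np.diag(var_dj)))
```

Replacing Wj with Zj in the stage-2 solve reproduces the (inconsistent) OLS estimator of item 1.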

Variance-Covariance Matrix

  1. Variance-Covariance Matrix Across Equations
    Let e = [e1, e2, ..., eG], where the estimated residual is ej = yj - Zjdj (j=1,2,...,G). Then the estimate of the GxG variance-covariance matrix Σ is defined as (see the sketch following this list):

    S = e'e/N.

  2. Variance-Covariance Matrix of Parameters
    Extending from the estimated variance-covariance matrix of parameters Var(dj) for j=1,2,...,G, the variance-covariance matrix of all parameters is defined as:

    Var(d) = [ smn(W'mZm)-1(W'mZn)(W'nZn)-1, m,n=1,2,...,G ]

    Where smn is the (m,n)-th element of S, and smm=s2m, snn=s2n.
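
As referenced in item 1 above, S and its elements smn are immediate once the per-equation residuals are stacked (a small sketch with placeholder residuals):

```python
import numpy as np

rng = np.random.default_rng(0)
N, G = 200, 3
e = rng.normal(size=(N, G))   # stand-in for [e_1, e_2, ..., e_G]

S = e.T @ e / N               # GxG estimate of Sigma
print(np.diag(S))             # s^2_1, ..., s^2_G on the diagonal
print(S[0, 1])                # s_12, a cross-equation term
```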

Full Information Estimation

Limited information estimation techniques such as 2SLS and LIML do not take into account the cross-equation correlations embedded in the system as a whole.
  1. Three Stage Least Squares
    From the 2SLS estimation for the j-th equation: yj = Wjδj + εj, where
    Wj = [X(X'X)-1X'Yj   Xj] and Zj = [Yj Xj] and δj = [βj γj].

    By stacking all the stochastic equations yj = Wjδj + εj (j=1,2,...,G) as follows:

    \[
    \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_G \end{bmatrix}
    =
    \begin{bmatrix}
    W_1 & 0 & \cdots & 0 \\
    0 & W_2 & \cdots & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \cdots & W_G
    \end{bmatrix}
    \begin{bmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_G \end{bmatrix}
    +
    \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_G \end{bmatrix}
    \]

    Write the above stacked-equation system as y = Wδ + ε, where
    y NGx1 data vector
    W NGx(Σj=1,...,G(Gj+Kj)) block-diagonal data matrix
    δ (Σj=1,...,G(Gj+Kj))x1 parameter vector
    ε NGx1 error vector

    The error structure ε satisfies:

    E(ε) = 0 and
    Var(ε) = E(εε') = Σ⊗I = [σijI   (i,j=1,2,...,G)]

    ε is clearly heteroscedastic and correlated across equations. Denote d as the Generalized Least Squares (GLS) estimator of δ. Then

    d = [W'(S-1⊗I)Z]-1W'(S-1⊗I)y

    \[
    = \begin{bmatrix}
    s^{11}W_1'Z_1 & s^{12}W_1'Z_2 & \cdots & s^{1G}W_1'Z_G \\
    s^{21}W_2'Z_1 & s^{22}W_2'Z_2 & \cdots & s^{2G}W_2'Z_G \\
    \vdots & \vdots & \ddots & \vdots \\
    s^{G1}W_G'Z_1 & s^{G2}W_G'Z_2 & \cdots & s^{GG}W_G'Z_G
    \end{bmatrix}^{-1}
    \begin{bmatrix}
    \sum_{j=1}^{G} s^{1j}W_1'y_j \\
    \sum_{j=1}^{G} s^{2j}W_2'y_j \\
    \vdots \\
    \sum_{j=1}^{G} s^{Gj}W_G'y_j
    \end{bmatrix}
    \]

    Var(d) = [W'(S-1⊗I)Z]-1

    S = e'e/N is the estimate of the variance-covariance matrix Σ, where e = [e1, e2, ..., eG] and the estimated residual is ej = yj - Zjdj (j=1,2,...,G). Furthermore, S-1 denotes the inverse of S with elements sjk (j,k=1,2,...,G).

    Note: Since S-1 depends on d, iterations of 3SLS may be performed until convergence. A numerical sketch of this computation follows the list.

  2. Full Information Maximum Likelihood
    Assuming normally distributed, serially independent residuals with zero mean and positive definite variance-covariance matrix Σ, the concentrated log-likelihood function for the system model YB + XΓ = U is

    L*(B,Γ) = -½ NG(1 + log(2π)) + N log(|B|) - ½ N log(|(YB+XΓ)'(YB+XΓ)|/N)

    Since log(|B|) = ½ log(|B'Y'YB|) - ½ log(|Y'Y|) (because |B'Y'YB| = |B|2|Y'Y|), we can also write

    L*(B,Γ) = -½ NG(1 + log(2π)) - ½ N log(|Y'Y|)
    + ½ N log(|B'Y'YB|/N) - ½ N log(|(YB+XΓ)'(YB+XΓ)|/N)

    Instrumental Variables Method
    The FIML estimator using the IV method is obtained by maximizing
    L*1(B,Γ) = N log(|B|) - ½ N log(|(YB+XΓ)'(YB+XΓ)/N|)

    The first derivatives of L*1(B,Γ) are used to set up the normal equations, similar to the iterative 3SLS estimation. Let S = (YB+XΓ)'(YB+XΓ)/N; the normal equations for maximizing L*1(B,Γ) are:

    ∂L*1/∂B = NB'-1 - Y'(YB+XΓ)S-1 = 0
    ∂L*1/∂Γ = -X'(YB+XΓ)S-1 = 0

    By substituting out N, combining terms, and using the parameter restrictions Π = -ΓB-1 in the first equation, it can be re-written as follows:

    ∂L*1/∂B = - Π'X'(YB+XΓ)S-1 = 0

    Together with the second equation, the normal equations in matrix form look like this:

    [XΓB-1   X]'(YB+XΓ)S-1 = 0

    We need to re-arrange the equations and parameters. Define Wj* = [(-XΓB-1)j   Xj] and write the typical j-th equation as yj = Wj*δj + εj (j=1,2,...,G). The corresponding stacked-equation system is y = W*δ + ε:

    \[
    \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_G \end{bmatrix}
    =
    \begin{bmatrix}
    W_1^* & 0 & \cdots & 0 \\
    0 & W_2^* & \cdots & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \cdots & W_G^*
    \end{bmatrix}
    \begin{bmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_G \end{bmatrix}
    +
    \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_G \end{bmatrix}
    \]

    As in the 3SLS, the FIML estimator for δ is:

    d = [W*'(S-1⊗I)Z]-1W*'(S-1⊗I)y

    \[
    = \begin{bmatrix}
    s^{11}{W_1^*}'Z_1 & s^{12}{W_1^*}'Z_2 & \cdots & s^{1G}{W_1^*}'Z_G \\
    s^{21}{W_2^*}'Z_1 & s^{22}{W_2^*}'Z_2 & \cdots & s^{2G}{W_2^*}'Z_G \\
    \vdots & \vdots & \ddots & \vdots \\
    s^{G1}{W_G^*}'Z_1 & s^{G2}{W_G^*}'Z_2 & \cdots & s^{GG}{W_G^*}'Z_G
    \end{bmatrix}^{-1}
    \begin{bmatrix}
    \sum_{j=1}^{G} s^{1j}{W_1^*}'y_j \\
    \sum_{j=1}^{G} s^{2j}{W_2^*}'y_j \\
    \vdots \\
    \sum_{j=1}^{G} s^{Gj}{W_G^*}'y_j
    \end{bmatrix}
    \]

    Var(d) = [W*'(S-1⊗I)Z]-1

    S = e'e/N and e = [e1, e2, ..., eG] with ej = yj - Zjdj (j=1,2,...,G).

    Linearized ML Method
    The FIML estimator using the linearized ML method is obtained by maximizing
    L*2(B,Γ) = log(|B'Y'YB|/N) - log(|(YB+XΓ)'(YB+XΓ)|/N)

    Let Q = B'Y'YB/N and S = (YB+XΓ)'(YB+XΓ)/N; then the normal equations for maximizing L*2(B,Γ) are:

    ∂L*2/∂B = Y'YBQ-1 - Y'(YB+XΓ)S-1 = 0
    ∂L*2/∂Γ = -X'(YB+XΓ)S-1 = 0

    Re-arrange the equations and parameters, and let Zj = [Yj   Xj] and Z0j = [Yj   0].

    Define
    \[
    Z = \begin{bmatrix}
    Z_1 & 0 & \cdots & 0 \\
    0 & Z_2 & \cdots & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \cdots & Z_G
    \end{bmatrix}
    \quad \text{and} \quad
    Z_0 = \begin{bmatrix}
    Z_{01} & 0 & \cdots & 0 \\
    0 & Z_{02} & \cdots & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \cdots & Z_{0G}
    \end{bmatrix}
    \]

    Then the FIML estimator of δ is derived from the following

    d = [Z'(S-1⊗I)Z - Z'0(Q-1⊗I)Z0]-1 [Z'(S-1⊗I)y - Z'0(Q-1⊗I)y]

    Where S = e'e/N, e = [e1, e2, ...,eG], and ej = yj - Zjdj (j=1,2,...G). Similarly, Q = e0'e0/N, e0 = [e01, e02, ...,e0G], and e0j = yj - Z0jdj (j=1,2,...,G).

    Newton Method
    Both the first derivatives (gradient) and second derivatives (Hessian) of L*2(B,Γ) are used in the iterative estimation.
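
As noted in the 3SLS discussion above, the stacked GLS solve can be coded directly. Here is a sketch of one 3SLS pass under these formulas, assuming per-equation arrays yj (length N), Zj = [Yj Xj], and Wj with fitted endogenous columns are already built (for example, as in the 2SLS sketch earlier); the helper name three_sls is made up:

```python
import numpy as np

def three_sls(y_list, Z_list, W_list, N):
    """One 3SLS pass: 2SLS residuals -> S, then the stacked GLS solve.

    y_list holds 1-D arrays y_j; Z_list holds Z_j = [Y_j X_j];
    W_list holds W_j with the fitted endogenous columns."""
    G = len(y_list)
    # Step 1: 2SLS per equation, residuals e_j = y_j - Z_j d_j
    e = np.column_stack([
        y - Z @ np.linalg.solve(W.T @ Z, W.T @ y)
        for y, Z, W in zip(y_list, Z_list, W_list)])
    S_inv = np.linalg.inv(e.T @ e / N)          # elements s^(mn)
    # Step 2: assemble the block system A d = b with weights s^(mn)
    p = [Z.shape[1] for Z in Z_list]            # G_j + K_j per equation
    offs = np.concatenate(([0], np.cumsum(p)))
    A = np.zeros((offs[-1], offs[-1]))
    b = np.zeros(offs[-1])
    for m in range(G):
        for n in range(G):
            A[offs[m]:offs[m+1], offs[n]:offs[n+1]] = (
                S_inv[m, n] * (W_list[m].T @ Z_list[n]))
            b[offs[m]:offs[m+1]] += S_inv[m, n] * (W_list[m].T @ y_list[n])
    return np.linalg.solve(A, b)                # stacked estimator d
```

Re-estimating S from the new residuals and repeating the solve gives the iterative 3SLS mentioned in the note.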

Applications

Dynamic Model Simulation

Deriving from the structural form YB + XΓ = U, the reduced form is Y = XΠ + V, where Π = -ΓB-1 and V = UB-1. Since the predetermined variables X may include lagged endogenous variables as well as current and lagged exogenous variables, we can write:

Y = Y-1Π1 + XΠ2 + V

From now on, X denotes the data matrix of current and lagged exogenous variables and Y-1 includes lagged endogenous variables. Then,

\[
\Pi = \begin{bmatrix} \Pi_1 \\ \Pi_2 \end{bmatrix}
\]

The stability of the model requires that the characteristic roots of Π1 lie inside the unit circle. A plot of the period (dynamic) multipliers against the lag length is called the Impulse Response Function.
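
The stability condition reduces to an eigenvalue computation on Π1. A sketch with a made-up 2x2 lag-coefficient matrix:

```python
import numpy as np

Pi1 = np.array([[0.6, 0.2],   # hypothetical lag-coefficient matrix
                [0.1, 0.5]])

roots = np.linalg.eigvals(Pi1)
stable = bool(np.all(np.abs(roots) < 1.0))
print(f"characteristic roots: {roots}, stable: {stable}")
```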

Example: Klein's Model I


Copyright © Kuan-Pin Lin
Last updated: 1/24/2012