Distributed Lag Models

Introduction

To keep the model presentation simple, a general distributed lag model is written as:

Yi = α + ∑j=0,1,...,∞ βjXi-j + εi

Define the long-run multiplier β = ∑j=0,1,...,∞ βj and the lag weights as

wj = βj / β, j = 0,1,2,...

so that ∑j=0,1,...,∞ wj = 1, and write the model as:

Yi = α + β ∑j=0,1,...,∞ wjXi-j + εi

Based on the lag weights, assuming all wj have the same sign and |wj| < 1, the following statistics are useful to characterize the period of adjustment to a new equilibrium:

Mean lag = ∑j=0,1,...,∞ j wj
Median lag = the smallest j* such that ∑j=0,1,...,j* wj ≥ 0.5

Geometric (Koyck) Lag Models

Suppose the lag weights of an infinite distributed lag model take the geometric form wj = (1-λ)λj, where 0 < λ < 1 is the rate of decline and 0 < (1-λ) < 1 is the rate of adjustment.
Clearly, ∑j=0,1,...,∞ wj = 1, since ∑j=0,1,...,∞ λj = 1/(1-λ). The model can be written as:

Yi = α + β(1-λ) ∑j=0,1,...,∞ λjXi-j + εi

or in the autoregressive form:

Yi = α(1-λ) + β(1-λ)Xi + λYi-1 + (εi-λεi-1)
= α0 + β0Xi + λYi-1 + υi

This model includes a lagged dependent variable, and it is autocorrelated by construction because the error term υi = εi-λεi-1 follows a first-order moving average process.
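
As a quick numerical check of the geometric weights (a sketch with an illustrative value λ = 0.6, not a number from the text), the weights sum to one and the implied mean lag is λ/(1-λ):

```python
# Sketch: geometric (Koyck) lag weights wj = (1-lam)*lam**j, truncated at
# a large J. lam = 0.6 is an illustrative rate of decline.
lam = 0.6
J = 200                                   # truncation point for the infinite sum
w = [(1 - lam) * lam**j for j in range(J)]

total = sum(w)                                     # should be approximately 1
mean_lag = sum(j * wj for j, wj in enumerate(w))   # approximates lam/(1-lam)
```

With λ = 0.6, the mean lag is 0.6/0.4 = 1.5 periods.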

Three applications of geometric lag models are considered, each involving the use of a proxy variable for an unobservable quantity (e.g., an expectation, or an equilibrium or long-run value).

Adaptive Expectation Model

Consider a model with one explanatory variable X:

Yi = α + βXi* + εi

where Xi* is an unobservable expected value of Xi satisfying the following:

Adaptive Expectation Hypothesis
X*i - X*i-1 = (1-λ) (Xi - X*i-1), or
X*i = (1-λ)Xi + λX*i-1
= (1-λ)(Xi + λXi-1 + λ2Xi-2 + ...)
= (1-λ)∑j=0,1,...,∞ λjXi-j
where 0 < (1-λ) < 1 is the coefficient of expectation, and the expectation is imperfect.

Combining the model with adaptive expectation hypothesis, we have:
Yi = α + β [(1-λ)Xi+λX*i-1] + εi, or
Yi = α + β ∑j=0,1,...,∞ wjXi-j + εi, with wj = (1-λ)λj, j=0,1,2,...

The model is a standard geometric (Koyck) lag model:
Yi = α0 + β0Xi + λYi-1 + υi, where

α0 = α(1-λ)
β0 = β(1-λ)
υi = εi-λεi-1

Partial Adjustment Model

Consider a model with one explanatory variable X:

Yi* = α + βXi + εi

where Yi* is an unobservable equilibrium or long-run desired value of Yi satisfying the following:

Partial Adjustment Hypothesis
Yi - Yi-1 = (1-λ) (Yi* - Yi-1), or
Yi = (1-λ)Yi* + λYi-1
= (1-λ)Yi*+λ[(1-λ)Yi-1*+λYi-2]
= (1-λ)Yi*+λ{(1-λ)Yi-1*+λ[(1-λ)Yi-2*+λYi-3]}
= ...
= (1-λ)∑j=0,1,...,∞ λjYi-j*
where 0 < (1-λ) < 1 is the coefficient of adjustment, and the adjustment is imperfect.

Combining the model with partial adjustment hypothesis, we have:
Yi = (1-λ){α(1+λ+λ2+...) + β(Xi+λXi-1+λ2Xi-2+...) + (εi+λεi-1+λ2εi-2+...)}, or
Yi = α + β ∑j=0,1,...,∞ wjXi-j + ∑j=0,1,...,∞ wjεi-j, with wj = (1-λ)λj, j=0,1,2,...

The model is a standard geometric (Koyck) lag model:
Yi = α0 + β0Xi + λYi-1 + υi, where

α0 = α(1-λ)
β0 = β(1-λ)
υi = ∑j=0,1,...,∞ wjεi-j - λ∑j=0,1,...,∞ wjεi-j-1
= (1-λ)(∑j=0,1,...,∞ λjεi-j - ∑j=0,1,...,∞ λj+1εi-j-1)
= (1-λ)εi
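
The equivalence above can be checked by simulation. The sketch below (with illustrative values α = 2, β = 0.5, λ = 0.4, none of which come from the text) generates data from the partial adjustment model and recovers the Koyck autoregressive form by least squares; here υ = (1-λ)ε is not autocorrelated, so OLS is consistent in this particular case:

```python
import numpy as np

# Sketch: simulate the partial adjustment model and recover the Koyck
# autoregressive form Yi = a0 + b0*Xi + lam*Yi-1 + vi by least squares.
# alpha, beta, lam are illustrative values.
rng = np.random.default_rng(0)
alpha, beta, lam = 2.0, 0.5, 0.4
N = 20000

X = rng.normal(size=N)
eps = 0.01 * rng.normal(size=N)
Ystar = alpha + beta * X + eps            # unobservable desired value
Y = np.empty(N)
Y[0] = Ystar[0]
for i in range(1, N):
    Y[i] = (1 - lam) * Ystar[i] + lam * Y[i - 1]   # partial adjustment

# regress Yi on [1, Xi, Yi-1]; v = (1-lam)*eps is serially uncorrelated here
A = np.column_stack([np.ones(N - 1), X[1:], Y[:-1]])
a0, b0, lam_hat = np.linalg.lstsq(A, Y[1:], rcond=None)[0]
# a0 ≈ alpha*(1-lam), b0 ≈ beta*(1-lam), lam_hat ≈ lam
```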

Error Correction Model

Consider a model with one explanatory variable X:

Yi* = α + βXi + εi

where Yi* is an unobservable equilibrium or long-run desired value of Yi satisfying the following:

Error Correction Hypothesis
Yi - Yi-1 = (1-γ)(Yi*-Yi-1*) + (1-λ)(Yi-1*-Yi-1), where

Yi* - Yi-1* = change in the desired values,
Yi-1* - Yi-1 = previous disequilibrium,
0 < (1-γ) < 1,
0 < (1-λ) < 1

If γ = λ, it is the partial adjustment hypothesis. Combining the model with error correction hypothesis, we have

Yi-Yi-1 = (1-γ)[β(Xi-Xi-1)+(εii-1)] + (1-λ)(α+βXi-1i-1-Yi-1), or
Yi = α0 + β0Xi + β1Xi-1 + λYi-1 + υi, where

α0 = (1-λ)α
β0 = (1-γ)β
β1 = (γ-λ)β
υi = (1-γ)εi+(γ-λ)εi-1

Lagged Dependent Variable

Let X = [X,Y-1], β = [β,ρ]'.
A time series regression with lagged dependent variables (not limited to the first lag of dependent variable) can be written as:

Yi = Xiβ + εi
= Xiβ + ρYi-1 + εi
= ∑j=0,1,...,∞ ρjXi-jβ + ∑j=0,1,...,∞ ρjεi-j

The model is a clear violation of Assumption 3 for the classical regression model: since Yi-1 depends on past errors, E(εi|Xi) = E(εi|Xi,Yi-1) ≠ 0 in general. Therefore, the least squares estimators are biased in finite samples.

For a large sample, if X and ε are not contemporaneously correlated (that is, plim(X'ε/N) = 0), then the least squares estimators are consistent and asymptotically efficient. However, if X and ε are contemporaneously correlated (that is, plim(X'ε/N) ≠ 0), then the least squares estimators are biased, inconsistent, and inefficient. Ordinary least squares (OLS) estimation is not recommended for a regression model with lagged dependent variables, in particular in the case of autocorrelation.

Model Estimation

A replacement for X, called Z, may be used to restore the consistency of least squares estimation. Z must satisfy the following conditions:

  1. Z is uncorrelated with the model error: plim(Z'ε/N) = 0.
  2. Z is correlated with the explanatory variables: plim(Z'X/N) exists and has full rank.

Z is called the matrix of instrumental variables for X.

Instrumental Variable Estimation

The estimation consists of two steps as follows:

  1. Formulate a multivariate regression as X = Zδ + u, and estimate the parameter matrix δ as:
    d = (Z'Z)-1Z'X, and
    Xp = Zd = Z(Z'Z)-1Z'X.

  2. Using OLS to estimate the model Y = Xpβ + ε:
    b = (Xp'Xp)-1Xp'Y
    = [X'Z(Z'Z)-1Z'X]-1 X'Z(Z'Z)-1Z'Y

It is clear that the selection of instrumental variables is crucial for a successful estimation of the model parameters. In practice, for a lagged dependent variables model, the instrumental variables include the exogenous explanatory variables and their lags as specified by the number of lagged dependent variables used in the regression. Therefore, the instrumental variable estimation (IV) is summarized as:

Define W = Z(Z'Z)-1Z'X, and note that W'X = W'W.
b = (W'X)-1W'Y = [X'Z(Z'Z)-1Z'X]-1 X'Z(Z'Z)-1Z'Y
Var(b) = s2(W'X)-1 = s2[X'Z(Z'Z)-1Z'X]-1
where s2 = e'e/(N-K) and e = Y - Xb.
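
The algebra above can be exercised directly on made-up data (the data-generating numbers below are purely illustrative). The sketch computes W = Z(Z'Z)-1Z'X, the IV estimator b = (W'X)-1W'Y, and Var(b), and confirms the noted identity W'X = W'W:

```python
import numpy as np

# Sketch of the IV formulas: W = Z(Z'Z)^-1 Z'X, b = (W'X)^-1 W'Y.
# Z, X, Y are simulated purely to exercise the algebra.
rng = np.random.default_rng(1)
N, K = 2000, 2
Z = rng.normal(size=(N, 3))                 # instruments (3 >= K columns)
X = Z @ np.array([[1.0, 0.2], [0.5, -1.0], [0.0, 0.8]]) \
    + 0.1 * rng.normal(size=(N, K))
Y = X @ np.array([1.0, -0.5]) + 0.1 * rng.normal(size=N)

W = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)   # projected X
b = np.linalg.solve(W.T @ X, W.T @ Y)       # IV estimator

e = Y - X @ b
s2 = e @ e / (N - K)
Var_b = s2 * np.linalg.inv(W.T @ X)
```

Since W is the projection of X onto the column space of Z, W'X = W'W, which is why (W'X)-1 is well defined whenever the instruments are correlated with X.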

Two Special Cases

  1. If Z and X have the same number of columns, then
    b = (Z'X)-1Z'Y
    Var(b) = s2(Z'X)-1

  2. If Z = X, then it is OLS.

Autocorrelation with Lagged Dependent Variable

A regression model with lagged dependent variables is complicated by the autocorrelation problem by construction. For testing first-order autocorrelation, the Durbin-Watson bounds test is biased toward 2 in the case of a lagged dependent variables model. For a large sample N, define the Durbin-H test statistic as follows:

DH = r[N/(1-N Var(b1))]½
= (1-DW/2)[N/(1-N Var(b1))]½ ~ normal(0,1)

where r and DW are obtained from the model:
Yi = Xiβ + εi (Xi includes lagged dependent variables, not limited to the first lag)
εi = ρεi-1 + υi

r is the estimated first-order correlation coefficient ρ, b1 is the estimated parameter of the first-order lagged dependent variable, and Var(b1) is the estimated variance of b1. If Var(b1) ≥ 1/N, the DH statistic cannot be computed. In that case, the Breusch-Godfrey test is useful for testing autocorrelation of any order.
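
A minimal sketch of the computation, using illustrative numbers for N, DW, and Var(b1) (none taken from the text):

```python
import math

# Sketch: Durbin-H from a reported DW statistic and the estimated
# variance of the coefficient on Yi-1. All numbers are illustrative.
N = 100
DW = 1.8
var_b1 = 0.004                  # must satisfy var_b1 < 1/N for DH to exist

r = 1 - DW / 2                  # implied first-order residual correlation
if N * var_b1 < 1:
    DH = r * math.sqrt(N / (1 - N * var_b1))
else:
    DH = None                   # DH undefined; use the Breusch-Godfrey test
```

Here DH ≈ 1.29, which would not reject the null of no first-order autocorrelation against the standard normal 5% critical value of 1.96.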

Instrumental Variable Estimation with Autocorrelation

To attain consistency for model estimation with lagged dependent variables, instrumental variables must be selected and used in the estimation. To correct for autocorrelation, the Cochrane-Orcutt or Prais-Winsten iterative procedure may be applied to the consistent estimates of the regression residuals, transforming the data series accordingly. Each iteration consists of two steps: instrumental variables estimation, followed by generalized least squares to correct for autocorrelation. The selection of instrumental variables must include sufficient lags of the exogenous explanatory variables, up to the specified order of autocorrelation and the number of lagged dependent variables. Because of the use of instrumental variables, the Durbin-Watson bounds test statistic is appropriate for testing first-order autocorrelation in this case.

Polynomial (Almon) Lag Models

Consider a finite distributed lag model as follows:

Yi = α + ∑j=0,1,...,q βjXi-j + εi

Since the lag q may be large, we assume that the lag parameter βj is a p-th order polynomial function of j, for j = 0,1,2,...,q and p ≤ q:

βj = γ0 + γ1j + γ2j2 + ... + γpjp = ∑k=0,1,...,p γk jk

The model can be written in terms of the polynomial parameters as:

Yi = α + ∑k=0,1,...,p γk Zik + εi

where Zik = ∑j=0,1,...,q jk Xi-j, k = 0,1,2,...,p.

Example

Consider a third-order polynomial lag model with 4 lags (p=3, q=4):

Yi = α + β0Xi + β1Xi-1 + β2Xi-2 + β3Xi-3 + β4Xi-4 + εi, with
βj = γ0 + γ1j + γ2j2 + γ3j3, j = 0,1,2,3,4. That is,

β0 = γ0
β1 = γ0+ γ1 + γ2 + γ3
β2 = γ0+2γ1+ 4γ2 +8γ3
β3 = γ0+3γ1+ 9γ2+27γ3
β4 = γ0+4γ1+16γ2+64γ3

Equivalently,

Yi = α + γ0Zi0 + γ1Zi1 + γ2Zi2 + γ3Zi3 + εi, where

Zi0 = Xi+Xi-1+Xi-2+Xi-3+Xi-4
Zi1 = Xi-1+ 2Xi-2+ 3Xi-3+ 4Xi-4
Zi2 = Xi-1+ 4Xi-2+ 9Xi-3+ 16Xi-4
Zi3 = Xi-1+ 8Xi-2+ 27Xi-3+64Xi-4
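
The transformed regressors can be built mechanically from the definition Zik = ∑j jkXi-j. A sketch for this example (the X series is made up for illustration):

```python
# Sketch: build the transformed regressors Zik = sum_j j**k * Xi-j for
# p = 3, q = 4, matching the example above. The X series is illustrative.
p, q = 3, 4
X = [1.0, 2.0, 4.0, 3.0, 5.0, 2.0, 6.0]   # X0, ..., X6 (made-up data)

def almon_z(X, i, k, q):
    """Zik = sum over j = 0..q of j**k * X[i-j]; requires i >= q."""
    return sum(j**k * X[i - j] for j in range(q + 1))

# one row of transformed regressors per usable observation i = q, ..., len(X)-1
Z = [[almon_z(X, i, k, q) for k in range(p + 1)] for i in range(q, len(X))]
```

Note that only N-q observations remain usable, since each Zik needs q lags of X.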

Model Estimation

For a p-th order polynomial q lags model:

βj = ∑k=0,1,...,p γk jk, j = 0,1,2,...,q and p ≤ q

We write β = Hγ, or

| β0 |   | 1  0  0   ...  0  |   | γ0 |
| β1 |   | 1  1  1   ...  1  |   | γ1 |
| β2 | = | 1  2  4   ...  2p |   | γ2 |
| :  |   | :  :  :        :  |   | :  |
| βq |   | 1  q  q2  ...  qp |   | γp |
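
The matrix H has row j equal to [1, j, j2, ..., jp]. A small sketch of the mapping β = Hγ for p = 3, q = 4 (the γ values are illustrative, not from the text):

```python
# Sketch: the mapping beta = H @ gamma for p = 3, q = 4; row j of H is
# [1, j, j**2, j**3]. The gamma values are illustrative.
p, q = 3, 4
H = [[j**k for k in range(p + 1)] for j in range(q + 1)]

gamma = [1.0, 0.5, -0.2, 0.05]
beta = [sum(H[j][k] * gamma[k] for k in range(p + 1)) for j in range(q + 1)]
# beta[0] = gamma[0]; beta[4] = gamma[0] + 4*gamma[1] + 16*gamma[2] + 64*gamma[3]
```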

End-Point Restriction

Additional end-point (tie-down) restrictions may be imposed:

β-1 = ∑k=0,1,...,p (-1)kγk = 0 (left end-point), and/or
βq+1 = ∑k=0,1,...,p (q+1)kγk = 0 (right end-point)

Least Squares Estimation

From the model Yi = α + ∑j=0,1,...,q βjXi-j + εi, or
Y = α + Xβ + ε
= α + XHγ + ε
= α + Zγ + ε (note Z = XH)
= [1 Z][α γ']' + ε

The least squares estimator of [α,γ']' = [α,γ0,γ1,...,γp]' is
a = (Z'Z)-1Z'Y
Var(a) = s2(Z'Z)-1
s2 = e'e/(N-K) and e = Y - Za
where Z is augmented with the constant column, and K = p+2 (only one explanatory variable X with a p-th order polynomial of q lags is considered).

The estimators of the original lag parameters β = [β0,β1,...,βq]' are obtained from β = Hγ (with a the estimate of γ):
b = Ha
Var(b) = H Var(a) H' = s2H(Z'Z)-1H'

Application: Granger Causality

An autoregressive distributed lag (ARDL) model can be used to test causality in the Granger sense for a pair of variables. The question is: does X Granger-cause Y, or does Y Granger-cause X?

Let X and Y be expressed in deviation form, and denote:
X → Y   X Granger-causes Y
Y → X   Y Granger-causes X
X ↔ Y   X Granger-causes Y and Y Granger-causes X (feedback)

Does X Granger Cause Y?
  Hypothesis: H0: X does not cause Y; H1: X causes Y
  Unrestricted model: Yi = ∑j=1,2,...,m αjYi-j + ∑k=1,2,...,n βkXi-k + εi
  Restricted model: Yi = ∑j=1,2,...,m αjYi-j + εi
  Test statistic: F = ((RSSR-RSSUR)/n) / (RSSUR/(N-m-n))
  Test: If F ≥ Fc(n,N-m-n), then reject H0; that is, X does cause Y. Otherwise, X does not cause Y.

Does Y Granger Cause X?
  Hypothesis: H0: Y does not cause X; H1: Y causes X
  Unrestricted model: Xi = ∑j=1,2,...,m ajYi-j + ∑k=1,2,...,n bkXi-k + ei
  Restricted model: Xi = ∑k=1,2,...,n bkXi-k + ei
  Test statistic: F = ((RSSR-RSSUR)/m) / (RSSUR/(N-m-n))
  Test: If F ≥ Fc(m,N-m-n), then reject H0; that is, Y does cause X. Otherwise, Y does not cause X.

Conclusion:
  X → Y   Reject H0 (X does not cause Y); do not reject H0 (Y does not cause X)
  Y → X   Do not reject H0 (X does not cause Y); reject H0 (Y does not cause X)
  X ↔ Y   Reject both null hypotheses

A better approach is to estimate the two equations as a system:

Yi = ∑j=1,2,...,mαjYi-j + ∑k=1,2,...,nβkXi-k + εi
Xi = ∑j=1,2,...,majYi-j + ∑k=1,2,...,nbkXi-k + ei

If X → Y (one-way causality), all a's equal 0.
If Y → X (one-way causality), all β's equal 0.
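
The single-equation F-test above can be sketched on simulated data (an assumption for illustration: the data are generated so that X Granger-causes Y, with m = n = 2 lags and no intercept since the series are treated as being in deviation form):

```python
import numpy as np

# Sketch of the Granger causality F-test with m = n = 2 lags.
# Simulated data: X Granger-causes Y by construction.
rng = np.random.default_rng(2)
T = 500
X = rng.normal(size=T)
Y = np.zeros(T)
for t in range(1, T):
    Y[t] = 0.3 * Y[t - 1] + 0.5 * X[t - 1] + 0.1 * rng.normal()

m = n = 2
idx = np.arange(max(m, n), T)
y = Y[idx]
lagsY = np.column_stack([Y[idx - j] for j in range(1, m + 1)])
lagsX = np.column_stack([X[idx - k] for k in range(1, n + 1)])

def rss(A, b):
    """Residual sum of squares from regressing b on A (no intercept)."""
    res = b - A @ np.linalg.lstsq(A, b, rcond=None)[0]
    return res @ res

RSS_UR = rss(np.column_stack([lagsY, lagsX]), y)  # unrestricted model
RSS_R = rss(lagsY, y)                             # restricted: drop X lags
N = len(y)
F = ((RSS_R - RSS_UR) / n) / (RSS_UR / (N - m - n))
# a large F relative to Fc(n, N-m-n) rejects H0: X does not Granger-cause Y
```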

Example

W. Thurman and M. Fisher, "Chickens, Eggs, and Causality, or Which Came First?" American Journal of Agricultural Economics, 1988, 237-238.


Copyright © Kuan-Pin Lin
Last updated: January 25, 2010