Distributed Lag Models

Introduction

To keep the model presentation simple, a general distributed lag model is written as:

Yi = α + ∑j=0,1,...,∞ βjXi-j + εi

Define the long-run multiplier β = ∑j=0,1,...,∞ βj and the lag weights as

wj = βj / β, j = 0,1,2,...

so that ∑j=0,1,...,∞ wj = 1, and write the model as:

Yi = α + β ∑j=0,1,...,∞ wjXi-j + εi

Based on the lag weights, assuming all wj have the same sign and |wj| < 1, the following statistics are useful to characterize the period of adjustment to a new equilibrium:

Mean lag = ∑j=0,1,...,∞ j wj
Median lag = the smallest j* such that ∑j=0,1,...,j* wj ≥ 0.5

Geometric (Koyck) Lag Models

Suppose the lag weights of an infinite distributed lag model take the geometric form wj = (1-λ)λj, where 0 < λ < 1 is the rate of decline and 0 < (1-λ) < 1 is the rate of adjustment.
Clearly, ∑j=0,1,...,∞ wj = 1, since ∑j=0,1,...,∞ λj = 1/(1-λ). The model can be written as:

Yi = α + β(1-λ) ∑j=0,1,...,∞ λjXi-j + εi

or in the autoregressive form:

Yi = α(1-λ) + β(1-λ)Xi + λYi-1 + (εi-λεi-1)
= α0 + β0Xi + λYi-1 + υi

This model includes a lagged dependent variable, and it is autocorrelated by construction because the error term υi = εi-λεi-1 follows a first-order moving average process.
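
As a quick numerical check of the geometric weights (a sketch with an illustrative value λ = 0.6, not a number from the text), the weights sum to one and the implied mean lag is λ/(1-λ):

```python
# Sketch: geometric (Koyck) lag weights wj = (1-lam)*lam**j, truncated at
# a large J. lam = 0.6 is an illustrative rate of decline.
lam = 0.6
J = 200                                   # truncation point for the infinite sum
w = [(1 - lam) * lam**j for j in range(J)]

total = sum(w)                                     # should be approximately 1
mean_lag = sum(j * wj for j, wj in enumerate(w))   # approximates lam/(1-lam)
```

With λ = 0.6, the mean lag is 0.6/0.4 = 1.5 periods.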

Three applications of geometric lag models are considered, each involving the use of a proxy variable for an unobservable quantity (e.g., an expectation, or an equilibrium or long-run value).

Adaptive Expectation Model

Consider a model with one explanatory variable X:

Yi = α + βXi* + εi

where Xi* is an unobservable expected value of Xi satisfying the following:

Adaptive Expectation Hypothesis
X*i - X*i-1 = (1-λ) (Xi - X*i-1), or
X*i = (1-λ)Xi + λX*i-1
= (1-λ)(Xi + λXi-1 + λ2Xi-2 + ...)
= (1-λ)∑j=0,1,...,∞ λjXi-j
where 0 < (1-λ) < 1 is the coefficient of expectation, and the expectation is imperfect.

Combining the model with adaptive expectation hypothesis, we have:
Yi = α + β [(1-λ)Xi+λX*i-1] + εi, or
Yi = α + β ∑j=0,1,...,∞ wjXi-j + εi, with wj = (1-λ)λj, j=0,1,2,...

The model is a standard geometric (Koyck) lag model:
Yi = α0 + β0Xi + λYi-1 + υi, where

α0 = α(1-λ)
β0 = β(1-λ)
υi = εi-λεi-1

Partial Adjustment Model

Consider a model with one explanatory variable X:

Yi* = α + βXi + εi

where Yi* is an unobservable equilibrium or long-run desired value of Yi satisfying the following:

Partial Adjustment Hypothesis
Yi - Yi-1 = (1-λ) (Yi* - Yi-1), or
Yi = (1-λ)Yi* + λYi-1
= (1-λ)Yi*+λ[(1-λ)Yi-1*+λYi-2]
= (1-λ)Yi*+λ{(1-λ)Yi-1*+λ[(1-λ)Yi-2*+λYi-3]}
= ...
= (1-λ)∑j=0,1,...,∞ λjYi-j*
where 0 < (1-λ) < 1 is the coefficient of adjustment, and the adjustment is imperfect.

Combining the model with partial adjustment hypothesis, we have:
Yi = (1-λ){α(1+λ+λ2+...) + β(Xi+λXi-1+λ2Xi-2+...) + (εi+λεi-1+λ2εi-2+...)}, or
Yi = α + β ∑j=0,1,...,∞ wjXi-j + ∑j=0,1,...,∞ wjεi-j, with wj = (1-λ)λj, j=0,1,2,...

The model is a standard geometric (Koyck) lag model:
Yi = α0 + β0Xi + λYi-1 + υi, where

α0 = α(1-λ)
β0 = β(1-λ)
υi = ∑j=0,1,...,∞ wjεi-j - λ∑j=0,1,...,∞ wjεi-j-1
= (1-λ)(∑j=0,1,...,∞ λjεi-j - ∑j=0,1,...,∞ λj+1εi-j-1)
= (1-λ)εi
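
The equivalence above can be checked by simulation. The sketch below (with illustrative values α = 2, β = 0.5, λ = 0.4, none of which come from the text) generates data from the partial adjustment model and recovers the Koyck autoregressive form by least squares; here υ = (1-λ)ε is not autocorrelated, so OLS is consistent in this particular case:

```python
import numpy as np

# Sketch: simulate the partial adjustment model and recover the Koyck
# autoregressive form Yi = a0 + b0*Xi + lam*Yi-1 + vi by least squares.
# alpha, beta, lam are illustrative values.
rng = np.random.default_rng(0)
alpha, beta, lam = 2.0, 0.5, 0.4
N = 20000

X = rng.normal(size=N)
eps = 0.01 * rng.normal(size=N)
Ystar = alpha + beta * X + eps            # unobservable desired value
Y = np.empty(N)
Y[0] = Ystar[0]
for i in range(1, N):
    Y[i] = (1 - lam) * Ystar[i] + lam * Y[i - 1]   # partial adjustment

# regress Yi on [1, Xi, Yi-1]; v = (1-lam)*eps is serially uncorrelated here
A = np.column_stack([np.ones(N - 1), X[1:], Y[:-1]])
a0, b0, lam_hat = np.linalg.lstsq(A, Y[1:], rcond=None)[0]
# a0 ≈ alpha*(1-lam), b0 ≈ beta*(1-lam), lam_hat ≈ lam
```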

Error Correction Model

Consider a model with one explanatory variable X:

Yi* = α + βXi + εi

where Yi* is an unobservable equilibrium or long-run desired value of Yi satisfying the following:

Error Correction Hypothesis
Yi - Yi-1 = (1-γ)(Yi*-Yi-1*) + (1-λ)(Yi-1*-Yi-1), where

Yi* - Yi-1* = change in the desired values,
Yi-1* - Yi-1 = previous disequilibrium,
0 < (1-γ) < 1,
0 < (1-λ) < 1

If γ = λ, it is the partial adjustment hypothesis. Combining the model with error correction hypothesis, we have

Yi-Yi-1 = (1-γ)[β(Xi-Xi-1)+(εii-1)] + (1-λ)(α+βXi-1i-1-Yi-1), or
Yi = α0 + β0Xi + β1Xi-1 + λYi-1 + υi, where

α0 = (1-λ)α
β0 = (1-γ)β
β1 = (γ-λ)β
υi = (1-γ)εi+(γ-λ)εi-1

Lagged Dependent Variable

Let X = [X,Y-1], β = [β,ρ]'.
A time series regression with lagged dependent variables (not limited to the first lag of dependent variable) can be written as:

Yi = Xiβ + εi
= Xiβ + ρYi-1 + εi
= ∑j=0,1,...,∞ ρjXi-jβ + ∑j=0,1,...,∞ ρjεi-j

The model is a clear violation of Assumption 3 for the classical regression model: since Yi-1 depends on past errors, E(εi|Xi) = E(εi|Xi,Yi-1) ≠ 0 in general. Therefore, the least squares estimators are biased in finite samples.

For a large sample, if X and ε are not contemporaneously correlated (that is, plim(X'ε/N) = 0), then the least squares estimators are consistent and asymptotically efficient. However, if X and ε are contemporaneously correlated (that is, plim(X'ε/N) ≠ 0), then the least squares estimators are biased, inconsistent, and inefficient. Ordinary least squares (OLS) estimation is not recommended for a regression model with lagged dependent variables, in particular in the case of autocorrelation.

Model Estimation

A replacement for X, called Z, may be used to restore the consistency of least squares estimation. Z must satisfy the following conditions:

  1. Z is uncorrelated with the model error: plim(Z'ε/N) = 0.
  2. Z is correlated with the explanatory variables: plim(Z'X/N) exists and has full rank.

Z is called the matrix of instrumental variables for X.

Instrumental Variable Estimation

The estimation consists of two steps as follows:

  1. Formulate a multivariate regression as X = Zδ + u, and estimate the parameter matrix δ as:
    d = (Z'Z)-1Z'X, and
    Xp = Zd = Z(Z'Z)-1Z'X.

  2. Using OLS to estimate the model Y = Xpβ + ε:
    b = (Xp'Xp)-1Xp'Y
    = [X'Z(Z'Z)-1Z'X]-1 X'Z(Z'Z)-1Z'Y

It is clear that the selection of instrumental variables is crucial for a successful estimation of the model parameters. In practice, for a lagged dependent variables model, the instrumental variables include the exogenous explanatory variables and their lags as specified by the number of lagged dependent variables used in the regression. Therefore, the instrumental variable estimation (IV) is summarized as:

Define W = Z(Z'Z)-1Z'X, and note that W'X = W'W.
b = (W'X)-1W'Y = [X'Z(Z'Z)-1Z'X]-1 X'Z(Z'Z)-1Z'Y
Var(b) = s2(W'X)-1 = s2[X'Z(Z'Z)-1Z'X]-1
where s2 = e'e/(N-K) and e = Y - Xb.
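
The algebra above can be exercised directly on made-up data (the data-generating numbers below are purely illustrative). The sketch computes W = Z(Z'Z)-1Z'X, the IV estimator b = (W'X)-1W'Y, and Var(b), and confirms the noted identity W'X = W'W:

```python
import numpy as np

# Sketch of the IV formulas: W = Z(Z'Z)^-1 Z'X, b = (W'X)^-1 W'Y.
# Z, X, Y are simulated purely to exercise the algebra.
rng = np.random.default_rng(1)
N, K = 2000, 2
Z = rng.normal(size=(N, 3))                 # instruments (3 >= K columns)
X = Z @ np.array([[1.0, 0.2], [0.5, -1.0], [0.0, 0.8]]) \
    + 0.1 * rng.normal(size=(N, K))
Y = X @ np.array([1.0, -0.5]) + 0.1 * rng.normal(size=N)

W = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)   # projected X
b = np.linalg.solve(W.T @ X, W.T @ Y)       # IV estimator

e = Y - X @ b
s2 = e @ e / (N - K)
Var_b = s2 * np.linalg.inv(W.T @ X)
```

Since W is the projection of X onto the column space of Z, W'X = W'W, which is why (W'X)-1 is well defined whenever the instruments are correlated with X.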

Two Special Cases

  1. If Z and X have the same number of columns, then
    b = (Z'X)-1Z'Y
    Var(b) = s2(Z'X)-1

  2. If Z = X, then it is OLS.

Autocorrelation with Lagged Dependent Variable

A regression model with lagged dependent variables is complicated by the autocorrelation problem by construction. For testing first-order autocorrelation, the Durbin-Watson bounds test is biased toward 2 in the case of a lagged dependent variables model. For a large sample N, define the Durbin-H test statistic as follows:

DH = r[N/(1-N Var(b1))]½
= (1-DW/2)[N/(1-N Var(b1))]½ ~ normal(0,1)

where r and DW are obtained from the model:
Yi = Xiβ + εi (Xi includes lagged dependent variables, not limited to the first lag)
εi = ρεi-1 + υi

r is the estimated first-order correlation coefficient ρ, b1 is the estimated parameter of the first-order lagged dependent variable, and Var(b1) is the estimated variance of b1. If Var(b1) ≥ 1/N, the DH statistic cannot be computed. In that case, the Breusch-Godfrey test is useful for testing autocorrelation of any order.
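
A minimal sketch of the computation, using illustrative numbers for N, DW, and Var(b1) (none taken from the text):

```python
import math

# Sketch: Durbin-H from a reported DW statistic and the estimated
# variance of the coefficient on Yi-1. All numbers are illustrative.
N = 100
DW = 1.8
var_b1 = 0.004                  # must satisfy var_b1 < 1/N for DH to exist

r = 1 - DW / 2                  # implied first-order residual correlation
if N * var_b1 < 1:
    DH = r * math.sqrt(N / (1 - N * var_b1))
else:
    DH = None                   # DH undefined; use the Breusch-Godfrey test
```

Here DH ≈ 1.29, which would not reject the null of no first-order autocorrelation against the standard normal 5% critical value of 1.96.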

Instrumental Variable Estimation with Autocorrelation

To attain consistency for model estimation with lagged dependent variables, instrumental variables must be selected and used in the estimation. To correct for autocorrelation, the Cochrane-Orcutt or Prais-Winsten iterative procedure may be applied to the consistent estimates of the regression residuals, transforming the data series accordingly. Each iteration consists of two steps: instrumental variables estimation, followed by generalized least squares to correct for autocorrelation. The selection of instrumental variables must include sufficient lags of the exogenous explanatory variables, up to the specified order of autocorrelation and the number of lagged dependent variables. Because of the use of instrumental variables, the Durbin-Watson bounds test statistic is appropriate for testing first-order autocorrelation in this case.

Polynomial (Almon) Lag Models

Consider a finite distributed lag model as follows:

Yi = α + ∑j=0,1,...,q βjXi-j + εi

Since the lag q may be large, we assume that the lag parameter βj is a p-th order polynomial function of j, for j = 0,1,2,...,q and p ≤ q:

βj = γ0 + γ1j + γ2j2 + ... + γpjp = ∑k=0,1,...,p γk jk

The model can be written in terms of the polynomial parameters as:

Yi = α + ∑k=0,1,...,p γk Zik + εi

where Zik = ∑j=0,1,...,q jk Xi-j, k = 0,1,2,...,p.

Example

Consider a third-order polynomial lag model with 4 lags (p=3, q=4):

Yi = α + β0Xi + β1Xi-1 + β2Xi-2 + β3Xi-3 + β4Xi-4 + εi, with
βj = γ0 + γ1j + γ2j2 + γ3j3, j = 0,1,2,3,4. That is,

β0 = γ0
β1 = γ0+ γ1 + γ2 + γ3
β2 = γ0+2γ1+ 4γ2 +8γ3
β3 = γ0+3γ1+ 9γ2+27γ3
β4 = γ0+4γ1+16γ2+64γ3

Equivalently,

Yi = α + γ0Zi0 + γ1Zi1 + γ2Zi2 + γ3Zi3 + εi, where

Zi0 = Xi+Xi-1+Xi-2+Xi-3+Xi-4
Zi1 = Xi-1+ 2Xi-2+ 3Xi-3+ 4Xi-4
Zi2 = Xi-1+ 4Xi-2+ 9Xi-3+ 16Xi-4
Zi3 = Xi-1+ 8Xi-2+ 27Xi-3+64Xi-4
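
The transformed regressors can be built mechanically from the definition Zik = ∑j jkXi-j. A sketch for this example (the X series is made up for illustration):

```python
# Sketch: build the transformed regressors Zik = sum_j j**k * Xi-j for
# p = 3, q = 4, matching the example above. The X series is illustrative.
p, q = 3, 4
X = [1.0, 2.0, 4.0, 3.0, 5.0, 2.0, 6.0]   # X0, ..., X6 (made-up data)

def almon_z(X, i, k, q):
    """Zik = sum over j = 0..q of j**k * X[i-j]; requires i >= q."""
    return sum(j**k * X[i - j] for j in range(q + 1))

# one row of transformed regressors per usable observation i = q, ..., len(X)-1
Z = [[almon_z(X, i, k, q) for k in range(p + 1)] for i in range(q, len(X))]
```

Note that only N-q observations remain usable, since each Zik needs q lags of X.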

Model Estimation

For a p-th order polynomial q lags model:

βj = ∑k=0,1,...,p γk jk, j = 0,1,2,...,q and p ≤ q

We write β = Hγ, or

| β0 |   | 1  0  0   ...  0  |   | γ0 |
| β1 |   | 1  1  1   ...  1  |   | γ1 |
| β2 | = | 1  2  4   ...  2p |   | γ2 |
| :  |   | :  :  :        :  |   | :  |
| βq |   | 1  q  q2  ...  qp |   | γp |
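
The matrix H has row j equal to [1, j, j2, ..., jp]. A small sketch of the mapping β = Hγ for p = 3, q = 4 (the γ values are illustrative, not from the text):

```python
# Sketch: the mapping beta = H @ gamma for p = 3, q = 4; row j of H is
# [1, j, j**2, j**3]. The gamma values are illustrative.
p, q = 3, 4
H = [[j**k for k in range(p + 1)] for j in range(q + 1)]

gamma = [1.0, 0.5, -0.2, 0.05]
beta = [sum(H[j][k] * gamma[k] for k in range(p + 1)) for j in range(q + 1)]
# beta[0] = gamma[0]; beta[4] = gamma[0] + 4*gamma[1] + 16*gamma[2] + 64*gamma[3]
```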

End-Point Restriction

Additional end-point (tie-down) restrictions may be imposed:

β-1 = ∑k=0,1,...,p (-1)kγk = 0 (left end-point), and/or
βq+1 = ∑k=0,1,...,p (q+1)kγk = 0 (right end-point)

Least Squares Estimation

From the model Yi = α + ∑j=0,1,...,q βjXi-j + εi, or
Y = α + Xβ + ε
= α + XHγ + ε
= α + Zγ + ε (note Z = XH)
= [1 Z][α γ']' + ε

The least squares estimator of [α,γ']' = [α,γ0,γ1,...,γp]' is
a = (Z'Z)-1Z'Y
Var(a) = s2(Z'Z)-1
s2 = e'e/(N-K) and e = Y - Za
where Z is augmented with the constant column, and K = p+2 (only one explanatory variable X with a p-th order polynomial of q lags is considered).

The estimators of the original lag parameters β = [β0,β1,...,βq]' are obtained from β = Hγ (with a the estimate of γ):
b = Ha
Var(b) = H Var(a) H' = s2H(Z'Z)-1H'

Application: Granger Causality

An autoregressive distributed lag (ARDL) model can be used to test causality in the Granger sense for a pair of variables. The question is: does X Granger-cause Y, or does Y Granger-cause X?

Let X and Y be expressed in deviation form, and denote:
X → Y   X Granger-causes Y
Y → X   Y Granger-causes X
X ↔ Y   X Granger-causes Y and Y Granger-causes X (feedback)

Does X Granger Cause Y?
  Hypothesis: H0: X does not cause Y; H1: X causes Y
  Unrestricted model: Yi = ∑j=1,2,...,m αjYi-j + ∑k=1,2,...,n βkXi-k + εi
  Restricted model: Yi = ∑j=1,2,...,m αjYi-j + εi
  Test statistic: F = ((RSSR-RSSUR)/n) / (RSSUR/(N-m-n))
  Test: If F ≥ Fc(n,N-m-n), then reject H0; that is, X does cause Y. Otherwise, X does not cause Y.

Does Y Granger Cause X?
  Hypothesis: H0: Y does not cause X; H1: Y causes X
  Unrestricted model: Xi = ∑j=1,2,...,m ajYi-j + ∑k=1,2,...,n bkXi-k + ei
  Restricted model: Xi = ∑k=1,2,...,n bkXi-k + ei
  Test statistic: F = ((RSSR-RSSUR)/m) / (RSSUR/(N-m-n))
  Test: If F ≥ Fc(m,N-m-n), then reject H0; that is, Y does cause X. Otherwise, Y does not cause X.

Conclusion:
  X → Y   Reject H0 (X does not cause Y); do not reject H0 (Y does not cause X)
  Y → X   Do not reject H0 (X does not cause Y); reject H0 (Y does not cause X)
  X ↔ Y   Reject both null hypotheses

A better approach is to estimate the two equations as a system:

Yi = ∑j=1,2,...,mαjYi-j + ∑k=1,2,...,nβkXi-k + εi
Xi = ∑j=1,2,...,majYi-j + ∑k=1,2,...,nbkXi-k + ei

If X → Y (one-way causality), all a's equal 0.
If Y → X (one-way causality), all β's equal 0.
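
The single-equation F-test above can be sketched on simulated data (an assumption for illustration: the data are generated so that X Granger-causes Y, with m = n = 2 lags and no intercept since the series are treated as being in deviation form):

```python
import numpy as np

# Sketch of the Granger causality F-test with m = n = 2 lags.
# Simulated data: X Granger-causes Y by construction.
rng = np.random.default_rng(2)
T = 500
X = rng.normal(size=T)
Y = np.zeros(T)
for t in range(1, T):
    Y[t] = 0.3 * Y[t - 1] + 0.5 * X[t - 1] + 0.1 * rng.normal()

m = n = 2
idx = np.arange(max(m, n), T)
y = Y[idx]
lagsY = np.column_stack([Y[idx - j] for j in range(1, m + 1)])
lagsX = np.column_stack([X[idx - k] for k in range(1, n + 1)])

def rss(A, b):
    """Residual sum of squares from regressing b on A (no intercept)."""
    res = b - A @ np.linalg.lstsq(A, b, rcond=None)[0]
    return res @ res

RSS_UR = rss(np.column_stack([lagsY, lagsX]), y)  # unrestricted model
RSS_R = rss(lagsY, y)                             # restricted: drop X lags
N = len(y)
F = ((RSS_R - RSS_UR) / n) / (RSS_UR / (N - m - n))
# a large F relative to Fc(n, N-m-n) rejects H0: X does not Granger-cause Y
```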

Example

W. Thurman and M. Fisher, "Chickens, Eggs, and Causality, or Which Came First?" American Journal of Agricultural Economics, 1988, 237-238.


Copyright © Kuan-Pin Lin
Last updated: January 25, 2010