Time Series Analysis I

Nonstationary Time Series

Time Series as Data Generating Process

Covariance Stationary Data Generating Process
Nonstationary Data Generating Process
Integrated Process

Trend in Time Series

Trend Stationary Process
Difference Stationary Process
Spurious Regression

Unit Roots Tests

Augmented Dickey-Fuller t-Test
Augmented Dickey-Fuller F-Test
Unit Roots Test Procedure

Unit Roots Tests with Structural Break

Unit Roots Tests with Exogenous Structural Break
Unit Roots Tests with Endogenous Structural Break

Cointegration Tests

Cointegration Test: The Engle-Granger Approach
Error Correction Model
Cointegration Test: The Johansen Approach

Appendix 1: Stability of a Dynamic Model

Appendix 2: Statistical Tables

Table 1: Critical Values for the Dickey-Fuller Unit Root t-Test Statistics
Table 2: Critical Values for the Dickey-Fuller Unit Root F-Test Statistics
Table 3: Critical Values for the Dickey-Fuller Unit Root t-Test Statistics with One-Time Structural Break
Table 4: Critical Values for the Engle-Granger Cointegration t-Test Statistics Applied to Regression Residuals
Table 5: Critical Values for Unit Root and Cointegration Tests Based on Response Surface Estimates
Table 6: Critical Values for the Johansen's Cointegration Likelihood Ratio Test Statistics

Readings

R. S. Tsay, Chapter 1, 2.7
W. Enders, Chapter 4, 6.
W. H. Greene, 7th ed., Chapter 21.
Additional Readings:
- C. Nelson and C. Plosser, "Trends and Random Walks in Macroeconomic Time Series: Some Evidence and Implications," Journal of Monetary Economics 10, 1982, 139-162.
- M. Osterwald-Lenum, "A Note with Quantiles of the Asymptotic Distribution of the Maximum Likelihood Cointegration Rank Test Statistics," Oxford Bulletin of Economics and Statistics 54, 1992, 461-472.
- P. Perron, "The Great Crash, the Oil Price Shock, and the Unit Root Hypothesis," Econometrica 57, 1989, 1361-1401.
- E. Zivot and D. W. K. Andrews, "Further Evidence on the Great Crash, the Oil Price Shock, and the Unit Root Hypothesis," Journal of Business and Economic Statistics 10, 1992, 251-270.

Time Series as Data Generating Process

Economic data series follow a random data generating process, stationary or nonstationary, although most macroeconomic time series are nonstationary. Nonstationarity in a time series can be identified by the presence of trend, seasonality, structural change, etc.

Covariance Stationary Data Generating Process

For each data observation Y₁, Y₂, ...

E(Y_t) = μ
Var(Y_t) = γ₀ = σ²
Cov(Y_t,Y_s) = γ_|t-s|, t ≠ s.

In other words, all the descriptive statistics about the time series: μ, γ₀, γ₁, γ₂, ... are time invariant.

White Noise Process
Y_t ~ ii(μ,σ²) for each observation t = 1,2,...
That is, Y_t is an independently and identically distributed data generating process with mean μ and constant variance σ²:
E(Y_t) = μ
Var(Y_t) = γ₀ = σ²
Cov(Y_t,Y_s) = 0, t ≠ s.
Autoregressive Process: AR(p)
Y_t = α + ρ₁Y_t-1 + ρ₂Y_t-2 + ... + ρ_pY_t-p + ε_t
where the roots of the p-th order polynomial equation in B (i.e. 1 - ρ₁B - ρ₂B² - ... - ρ_pB^p = 0) lie outside the unit circle; and ε_t ~ ii(0,σ²), t = 1,2,...
Moving Average Process: MA(q)
Y_t = μ - θ₁ε_t-1 - θ₂ε_t-2 - ... - θ_qε_t-q + ε_t
where the roots of the q-th order polynomial equation in B (i.e. 1 - θ₁B - θ₂B² - ... - θ_qB^q = 0) lie outside the unit circle (the invertibility condition); and ε_t ~ ii(0,σ²), t = 1,2,...
Mixed Autoregressive and Moving Average Process: ARMA(p,q)
Y_t = δ + ρ₁Y_t-1 + ρ₂Y_t-2 + ... + ρ_pY_t-p - θ₁ε_t-1 - θ₂ε_t-2 - ... - θ_qε_t-q + ε_t
where the roots of the p-th order polynomial equation in B (i.e. 1 - ρ₁B - ρ₂B² - ... - ρ_pB^p = 0) lie outside the unit circle; the roots of the q-th order polynomial equation in B (i.e. 1 - θ₁B - θ₂B² - ... - θ_qB^q = 0) lie outside the unit circle; and ε_t ~ ii(0,σ²), t = 1,2,...

Nonstationary Data Generating Process

Deterministic Trend Process
Y_t = α + βt + ε_t where ε_t ~ ii(0,σ²), t = 1,2,.... Then
E(Y_t) = α + βt
Var(Y_t) = σ²
As t →∞, E(Y_t) →∞. This is the model with linear trend in the mean.
Stochastic Trend (Random Walk) Process
- Random Walk
  Y_t = Y_t-1 + ε_t where ε_t ~ ii(0,σ²), t = 1,2,.... Equivalently,
  Y_t = Y₀ + ∑_i=1,2,...,t ε_i
  Assuming Y₀ exists and is finite,
  E(Y_t) = Y₀
  Var(Y_t) = tσ²
  As t →∞, Var(Y_t) →∞.
  This is the model with linear trend in the variance.
- Random Walk with Drift
  Y_t = α + Y_t-1 + ε_t where ε_t ~ ii(0,σ²), t = 1,2,.... Equivalently,
  Y_t = Y₀ + αt + ∑_i=1,2,...,t ε_i
  Assuming Y₀ exists and is finite,
  E(Y_t) = Y₀ + αt
  Var(Y_t) = tσ²
  As t →∞, E(Y_t) →∞ and Var(Y_t) →∞.
  This is the model with linear trend in the mean and variance.
- Random Walk with Trend and Drift
  Y_t = α + βt + Y_t-1 + ε_t where ε_t ~ ii(0,σ²), t = 1,2,.... Equivalently,
  Y_t = Y₀ + a t + b t² + ∑_i=1,2,...,t ε_i
  where a = α + β/2 and b = β/2. Assuming Y₀ exists and is finite,
  E(Y_t) = Y₀ + a t + b t²
  Var(Y_t) = tσ²
  As t →∞, E(Y_t) →∞ and Var(Y_t) →∞.
  This is the model with quadratic trend in the mean and linear trend in the variance.
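
The closed form for the random walk with trend and drift can be checked numerically. The sketch below is illustrative only: the function names, parameter values, and seed are assumptions, not part of the notes. It iterates the recursion and compares the result with Y₀ + at + bt² plus the cumulated shocks.

```python
import numpy as np

def simulate_recursion(alpha, beta, y0, eps):
    """Iterate Y_t = alpha + beta*t + Y_{t-1} + eps_t for t = 1..T."""
    y = np.empty(len(eps))
    prev = y0
    for t, e in enumerate(eps, start=1):
        prev = alpha + beta * t + prev + e
        y[t - 1] = prev
    return y

def closed_form(alpha, beta, y0, eps):
    """Y_t = Y_0 + a*t + b*t^2 + sum of shocks, with a = alpha + beta/2, b = beta/2."""
    t = np.arange(1, len(eps) + 1)
    a = alpha + beta / 2.0
    b = beta / 2.0
    return y0 + a * t + b * t**2 + np.cumsum(eps)

# Hypothetical parameter values chosen only for illustration.
rng = np.random.default_rng(0)
eps = rng.normal(0.0, 1.0, size=200)
y_rec = simulate_recursion(0.5, 0.1, 2.0, eps)
y_cf = closed_form(0.5, 0.1, 2.0, eps)
max_gap = float(np.max(np.abs(y_rec - y_cf)))  # should be numerically zero
```

Setting beta to zero reduces both functions to the random walk with drift, and setting alpha to zero as well gives the pure random walk.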

Integrated Process

A stationary process can be derived from a nonstationary process by differencing the series one or more times. Therefore the original level series is the integration of the differenced series. An integrated process of order d is denoted by I(d) for d=0,1,2,...

That is, Y_t ~ I(d) if Δ^dY_t is stationary, where

ΔY_t = Y_t - Y_t-1,
Δ²Y_t = ΔY_t - ΔY_t-1, ...

For example, if Y_t ~ I(1), then

Y_t = ΔY_t + Y_t-1
    = ΔY_t + ΔY_t-1 + Y_t-2 = ...
    = Y₀ + ∑_j=0,...,t-1 ΔY_t-j, with a known Y₀.

Similarly, if Y_t ~ I(2), then, with known initial values Y₀ and ΔY₀,

ΔY_t-j = ΔY₀ + ∑_i=0,...,t-j-1 Δ²Y_t-j-i and

Y_t = Y₀ + ∑_j=0,...,t-1 ΔY_t-j
    = Y₀ + tΔY₀ + ∑_j=0,...,t-1 ∑_i=0,...,t-j-1 Δ²Y_t-j-i

The white noise process is an integrated process of order 0, or I(0). A random walk process is an integrated process of order 1, or I(1).
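
As a small illustration of integration and differencing, the following sketch (variable names and the zero initial value are assumptions) builds I(1) and I(2) series by cumulating white noise and then recovers the shocks by differencing once and twice.

```python
import numpy as np

rng = np.random.default_rng(1)
eps = rng.normal(size=100)        # I(0) shocks (white noise)
y0 = 0.0                          # assumed known initial value

y1 = y0 + np.cumsum(eps)          # I(1): a random walk
y2 = np.cumsum(y1)                # I(2): a cumulated random walk

d1 = np.diff(y1, prepend=y0)      # first difference of the I(1) series
d2 = np.diff(y2, n=2)             # second difference of the I(2) series

# d1 reproduces eps exactly; d2 reproduces eps from the third observation on,
# because two observations are lost when differencing twice.
same_first = bool(np.allclose(d1, eps))
same_second = bool(np.allclose(d2, eps[2:]))
```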

Trend in Time Series

Trend Stationary Process

A stationary time series process can be derived by removing the linear or quadratic trend from a nonstationary series. Such a series is called trend stationary.

Y_t = α + βt + ε_t, or
Y_t = α + βt + γt² + ε_t

If ε_t is stationary, then Y_t is a trend stationary process.

Difference Stationary Process

A stationary time series process can be derived by differencing a nonstationary series. Such a series is called difference stationary. Removing the trend from a difference stationary series does not necessarily yield a stationary series, because detrending does not remove the trend in the variance. However, differencing a trend stationary process does yield a stationary series, so a trend stationary process is also difference stationary.

Random Walk
Y_t = Y_t-1 + ε_t, or
ΔY_t = Y_t - Y_t-1 = ε_t
If ε_t is stationary, then Y_t is a difference stationary process.
Random Walk with Drift
Y_t = α + Y_t-1 + ε_t, or
ΔY_t = Y_t - Y_t-1 = α + ε_t
If ε_t is stationary, then Y_t is a difference stationary process.
Random Walk with Trend and Drift
Y_t = α + βt + Y_t-1 + ε_t, or
ΔY_t = Y_t - Y_t-1 = α + βt + ε_t
If ε_t is stationary, then Y_t is a difference stationary process (ΔY_t is a trend stationary process).

Spurious Regression

Most macroeconomic time series are nonstationary and may have a trend; that is, they are trend nonstationary. After removing the trend, only trend stationary series are meaningful. Differencing a nonstationary time series does not establish trend stationarity, so a trend regression on such a nonstationary time series is meaningless, or spurious. A regression involving trend nonstationary time series may be spurious, with the following characteristics:

High R²
Low DW (DW → 0 or ρ → 1)
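
A simulation sketch of the low Durbin-Watson symptom, under assumed settings (two independent random walks, 1000 observations, a fixed seed). The helper name is hypothetical. It regresses one walk on the other in levels and in differences and reports R² and DW in each case.

```python
import numpy as np

def ols_r2_dw(y, x):
    """OLS of y on a constant and x; returns (R-squared, Durbin-Watson)."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    dw = np.sum(np.diff(resid) ** 2) / (resid @ resid)
    return float(r2), float(dw)

rng = np.random.default_rng(42)
n = 1000
y = np.cumsum(rng.normal(size=n))   # random walk
x = np.cumsum(rng.normal(size=n))   # independent random walk

r2_levels, dw_levels = ols_r2_dw(y, x)                  # levels regression
r2_diff, dw_diff = ols_r2_dw(np.diff(y), np.diff(x))    # differenced regression
print(f"levels: R2={r2_levels:.3f} DW={dw_levels:.3f}")
print(f"diffs:  R2={r2_diff:.3f} DW={dw_diff:.3f}")
```

In the levels regression the residuals inherit the random walk behaviour, so DW is close to 0. In the differenced regression DW is close to 2. The size of R² in levels varies from sample to sample.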

Unit Roots Tests

Testing for a difference stationary process is important since it is a potential source of spurious regression. That is, a trend nonstationary process should be estimated with differenced data series, while a trend stationary process can be estimated with level data series.

The purpose of a unit roots test is to statistically test the data generating process for difference stationarity (trend nonstationarity) against trend stationarity. It is a formal test of the Random Walk Hypothesis.

Dickey-Fuller (DF) and Augmented Dickey-Fuller (ADF) tests for unit roots (or random walk) depend on:

The Model: I, II, III
The Sample Size: N
The Level of Significance: e

The model error is assumed to be serially uncorrelated and homogeneously distributed. Extensions of the DF tests include Said-Dickey for an ARMA error structure, and Phillips-Perron for a weakly dependent and heterogeneously distributed error structure. Both extensions of the unit roots test have the same asymptotic distribution as the Dickey-Fuller distribution.

Augmented Dickey-Fuller t-Test

Simple Hypothesis Testing of Unit Root

Model I
ΔY_t = (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t

Hypothesis: H₀: ρ = 1 against H₁: ρ < 1
Test Statistic: t_ρ = (p-1)/se(p), where p is the estimated ρ
Critical Value: ADF_{t_ρ}(I,N,e)

Model II
ΔY_t = α + (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t

Hypothesis:
  (1) H₀: ρ = 1 against H₁: ρ < 1
  (2) H₀: α = 0, given ρ = 1, against H₁: α ≠ 0
Test Statistic:
  (1) t_ρ = (p-1)/se(p), where p is the estimated ρ
  (2) t_α = a/se(a), where a is the estimated α
Critical Value:
  (1) ADF_{t_ρ}(II,N,e)
  (2) ADF_{t_α}(II,N,e)

Model III
ΔY_t = α + βt + (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t

Hypothesis:
  (1) H₀: ρ = 1 against H₁: ρ < 1
  (2) H₀: α = 0, given ρ = 1, against H₁: α ≠ 0
  (3) H₀: β = 0, given ρ = 1, against H₁: β ≠ 0
Test Statistic:
  (1) t_ρ = (p-1)/se(p), where p is the estimated ρ
  (2) t_α = a/se(a), where a is the estimated α
  (3) t_β = b/se(b), where b is the estimated β
Critical Value:
  (1) ADF_{t_ρ}(III,N,e)
  (2) ADF_{t_α}(III,N,e)
  (3) ADF_{t_β}(III,N,e)
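
The t_ρ statistic of Models I, II and III can be computed by ordinary least squares. The following is a minimal sketch rather than a reference implementation: the function name `adf_t_stat`, its arguments, and the simulated data are assumptions. The statistic must be compared with the ADF critical values (Table 1), not with the standard t table.

```python
import numpy as np

def adf_t_stat(y, lags=0, model="III"):
    """t statistic on Y_{t-1} in the ADF regression.

    model "I": no constant; "II": constant; "III": constant and trend.
    lags is J, the number of lagged differences.
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    dep = dy[lags:]                              # usable ΔY_t
    cols = [y[lags:-1]]                          # Y_{t-1}, coefficient (rho - 1)
    for j in range(1, lags + 1):
        cols.append(dy[lags - j:len(dy) - j])    # ΔY_{t-j}
    n = len(dep)
    if model in ("II", "III"):
        cols.append(np.ones(n))
    if model == "III":
        cols.append(np.arange(1, n + 1, dtype=float))
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, dep, rcond=None)
    resid = dep - X @ coef
    s2 = (resid @ resid) / (n - X.shape[1])      # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)
    return float(coef[0] / np.sqrt(cov[0, 0]))   # (p - 1) / se(p)

# Illustrative data: a random walk and a white noise series.
rng = np.random.default_rng(7)
walk = np.cumsum(rng.normal(size=500))
noise = rng.normal(size=500)
t_walk = adf_t_stat(walk, lags=2, model="II")
t_noise = adf_t_stat(noise, lags=2, model="II")
```

For the white noise series the statistic lies far below the Model II critical values in Table 1, so the unit root is rejected. For the random walk the statistic is much closer to zero.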

Augmented Dickey-Fuller F-Test

Joint Hypothesis Testing of Unit Root

Model II
ΔY_t = α + (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t

Hypothesis: H₀: α = 0, ρ = 1 against H₁: not H₀
Restricted Model: ΔY_t = ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t
Test Statistic: F_α,ρ = [(RSS_r-RSS_ur)/2] / [RSS_ur/(N-J-2)]
Critical Value: ADF_{F_α,ρ}(II,N,e)

Model III
ΔY_t = α + βt + (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t

Hypothesis:
  (1) H₀: α = 0, β = 0, ρ = 1 against H₁: not H₀
  (2) H₀: β = 0, ρ = 1 against H₁: not H₀
Restricted Model:
  (1) ΔY_t = ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t
  (2) ΔY_t = α + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t
Test Statistic:
  (1) F_α,β,ρ = [(RSS_r-RSS_ur)/3] / [RSS_ur/(N-J-3)]
  (2) F_β,ρ = [(RSS_r-RSS_ur)/2] / [RSS_ur/(N-J-3)]
Critical Value:
  (1) ADF_{F_α,β,ρ}(III,N,e)
  (2) ADF_{F_β,ρ}(III,N,e)
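
The joint test compares restricted and unrestricted residual sums of squares. The sketch below covers only the Model II statistic F_α,ρ. The function names are hypothetical, and N is taken to be the number of usable observations after losing the lags. The result is to be compared with the ADF F critical values (Table 2), not the standard F table.

```python
import numpy as np

def _rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def adf_f_stat_model2(y, lags=0):
    """F statistic for H0: alpha = 0, rho = 1 in Model II with J = lags."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    dep = dy[lags:]
    n = len(dep)
    lagged = [dy[lags - j:len(dy) - j] for j in range(1, lags + 1)]
    # Unrestricted: constant, Y_{t-1}, lagged differences.
    X_ur = np.column_stack([np.ones(n), y[lags:-1]] + lagged)
    rss_ur = _rss(X_ur, dep)
    # Restricted: lagged differences only (no regressors at all when J = 0).
    rss_r = _rss(np.column_stack(lagged), dep) if lags > 0 else float(dep @ dep)
    return ((rss_r - rss_ur) / 2.0) / (rss_ur / (n - lags - 2))

# Illustrative data: a random walk and a white noise series.
rng = np.random.default_rng(9)
walk = np.cumsum(rng.normal(size=500))
noise = rng.normal(size=500)
f_walk = adf_f_stat_model2(walk, lags=1)
f_noise = adf_f_stat_model2(noise, lags=1)
```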

Unit Roots Test Procedure

Step 1:

Estimate Model III:
ΔY_t = α + βt + (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t
Test ρ = 1 using the ADF_{t_ρ} distribution:
If ρ < 1 then stop (no unit root), else continue.
Test β = 0 given ρ = 1, using the ADF_{t_β} or ADF_{F_β,ρ} distribution:
Given ρ = 1, if β ≠ 0, then estimate ΔY_t = α + βt + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t and test β = 0 using the standard t-distribution:
- If β ≠ 0 then go back to Step 1 and test ρ = 1 using the standard t-distribution:
  If ρ < 1 then stop (no unit root), else conclude (unit root).
Else (β = 0) continue to Step 2.

Step 2:

Estimate Model II:
ΔY_t = α + (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t
Test ρ = 1 using the ADF_{t_ρ} distribution:
If ρ < 1 then stop (no unit root), else continue.
Test α = 0 given ρ = 1, using the ADF_{t_α} or ADF_{F_α,ρ} distribution:
Given ρ = 1, if α ≠ 0, then estimate ΔY_t = α + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t and test α = 0 using the standard t-distribution:
- If α ≠ 0 then go back to Step 2 and test ρ = 1 using the standard t-distribution:
  If ρ < 1 then stop (no unit root), else conclude (unit root).
Else (α = 0) continue to Step 3.

Step 3:

Estimate Model I:
ΔY_t = (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t
Test ρ = 1 using the ADF_{t_ρ} distribution:
If ρ < 1 then stop (no unit root), else conclude (unit root).

Alternative Representation of Unit Roots Tests

Model I
ΔY_t = (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j+ε_t
If J=0, ΔY_t = (ρ-1)Y_t-1 + ε_t. That is, Y_t = ρY_t-1 + ε_t
If J=1, ΔY_t = (ρ-1)Y_t-1 + ρ₁ΔY_t-1 + ε_t. That is, Y_t = (ρ+ρ₁)Y_t-1 - ρ₁Y_t-2 + ε_t
If J=2, Y_t = (ρ+ρ₁)Y_t-1 + (ρ₂-ρ₁)Y_t-2 - ρ₂Y_t-3 + ε_t
...
In general, Y_t = (ρ+ρ₁)Y_t-1 + (ρ₂-ρ₁)Y_t-2 + (ρ₃-ρ₂)Y_t-3 + ... + (ρ_J-ρ_J-1)Y_t-J - ρ_JY_t-(J+1) + ε_t
That is, Y_t = π₁Y_t-1 + π₂Y_t-2 + π₃Y_t-3 + ... + π_JY_t-J + π_J+1Y_t-(J+1) + ε_t
where π₁=ρ+ρ₁, π_j=ρ_j-ρ_j-1, j=2,...,J, π_J+1=-ρ_J.
Because ∑_j=1,...,J+1 π_j = ρ, testing for the unit root ρ = 1 is equivalent to testing ∑_j=1,...,J+1 π_j = 1.
Model II
ΔY_t = α + (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j+ε_t
is equivalent to
Y_t = α + π₁Y_t-1 + π₂Y_t-2 + π₃Y_t-3 + ... + π_JY_t-J + π_J+1Y_t-(J+1) + ε_t
where π₁=ρ+ρ₁, π_j=ρ_j-ρ_j-1, j=2,...,J, π_J+1=-ρ_J, and ∑_j=1,...J+1π_j = ρ
Model III
ΔY_t = α + βt + (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j+ε_t
is equivalent to
Y_t = α + βt + π₁Y_t-1 + π₂Y_t-2 + π₃Y_t-3 + ... + π_JY_t-J + π_J+1Y_t-(J+1) + ε_t
where π₁=ρ+ρ₁, π_j=ρ_j-ρ_j-1, j=2,...,J, π_J+1=-ρ_J, and ∑_j=1,...J+1π_j = ρ
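
The mapping from the ADF form to the levels form can be written as a short function. This is a sketch with a hypothetical function name and illustrative coefficient values. It builds π₁, ..., π_J+1 from ρ and ρ₁, ..., ρ_J and confirms that the π's sum to ρ.

```python
import numpy as np

def ar_levels_coefficients(rho, rho_lags):
    """Map (rho, rho_1..rho_J) of the ADF form to pi_1..pi_{J+1} of the levels form."""
    rho_lags = np.asarray(rho_lags, dtype=float)
    J = len(rho_lags)
    pi = np.zeros(J + 1)
    pi[0] = rho
    pi[:J] += rho_lags   # rho_j * ΔY_{t-j} contributes +rho_j to Y_{t-j}
    pi[1:] -= rho_lags   # ... and -rho_j to Y_{t-j-1}
    return pi

# Illustrative values: rho = 0.9, rho_1 = 0.3, rho_2 = -0.2 (J = 2).
pi = ar_levels_coefficients(0.9, [0.3, -0.2])
print(pi)        # pi_1 = rho + rho_1, pi_2 = rho_2 - rho_1, pi_3 = -rho_2
print(pi.sum())  # equals rho
```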

Alternative Tests for Unit Roots

Phillips-Perron (1987) Test
Based on model selection criteria, ADF tests use lagged differenced terms to filter serial correlation in the test equation. The alternative Phillips-Perron unit root tests use Newey-West robust standard errors to account for serial correlation. Two statistics are computed: (1) T(p-1), (2) (p-1)/se^*(p), where p is the OLS estimate of ρ and se^*(p) is the estimated robust standard error of p, from the following three random walk model specifications:
1. Y_t = ρY_t-1 + ε_t
2. Y_t = α + ρY_t-1 + ε_t
3. Y_t = α + βt + ρY_t-1 + ε_t
Phillips-Perron test statistics can be viewed as Dickey-Fuller test statistics that have been made robust to serial correlation by estimating the Newey-West heteroscedasticity autocorrelation consistent variance-covariance matrix. Phillips-Perron test statistics have the same distribution as the Dickey-Fuller test statistics. Therefore, the ADF critical values can be used to carry out the test.
ERS (Elliott, Rothenberg, and Stock, 1996) DF-GLS Test
When a deterministic trend is present in the test equation, it has been argued that ADF unit root tests have low power (that is, it becomes more difficult to reject an incorrect null hypothesis of unit roots). This is the case, in particular, when the drift or the trend is not part of the data generating process. It is necessary to distinguish the effects of unit roots from those of the deterministic trend. Elliott, Rothenberg, and Stock suggested removing the trend or drift first using GLS, then performing the unit roots test on the filtered data series. There is evidence that ERS's DF-GLS test has significantly greater power than the ADF test.
The idea of DF-GLS test is to estimate the trend of the data series {Y_t} by GLS: a + bt. Then the filtered series is defined by: Y_t^* = Y_t - (a + bt). Finally, we perform an ADF test on the filtered data series {Y_t^*} using tabulated critical values (see Elliott, Rothenberg, and Stock, 1996).

Unit Roots Tests with Structural Break

The classical unit roots tests described above tend not to reject the unit root (that is, they have low power) for a time series with a changing mean or breaking trend. Let T_B be the break time in the sample period T, and define λ = T_B/T.

Exogenous Structural Break

If the breakpoint λ is fixed (or given a priori), based on Model III (random walk with drift and trend), Perron [1989] considered three versions of hypothesis testing for unit roots and structural change:

Model IIIa
H₀: Y_t = α + Y_t-1 + θD(T_B)_t + ε_t
H₁: Y_t = α₁ + βt + (α₂-α₁)DU_t + ε_t

Model IIIb
H₀: Y_t = α₁ + Y_t-1 + (α₂-α₁)DU_t + ε_t
H₁: Y_t = α + β₁t + (β₂-β₁)DT_t + ε_t

Model IIIc
H₀: Y_t = α₁ + Y_t-1 + θD(T_B)_t + (α₂-α₁)DU_t + ε_t
H₁: Y_t = α₁ + β₁t + (α₂-α₁)DU_t + (β₂-β₁)DT_t + ε_t

where ε_t is stationary and possibly described by an ARMA(p,q) process, and

D(T_B)_t = 1 if t = T_B+1, 0 otherwise
DU_t = 1 if t > T_B, 0 otherwise
DT_t = t-T_B if t > T_B, 0 otherwise
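
The three break dummies are simple to construct. A small sketch, with a hypothetical function name and time indexed as t = 1, ..., T:

```python
import numpy as np

def break_dummies(T, T_B):
    """Return D(T_B)_t, DU_t and DT_t for t = 1..T and break date T_B."""
    t = np.arange(1, T + 1)
    d_tb = (t == T_B + 1).astype(float)               # one-time pulse after the break
    du = (t > T_B).astype(float)                      # level shift
    dt = np.where(t > T_B, t - T_B, 0).astype(float)  # trend shift
    return d_tb, du, dt

d_tb, du, dt = break_dummies(T=8, T_B=5)
print(d_tb)  # [0. 0. 0. 0. 0. 1. 0. 0.]
print(du)    # [0. 0. 0. 0. 0. 1. 1. 1.]
print(dt)    # [0. 0. 0. 0. 0. 1. 2. 3.]
```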

Then the corresponding augmented testing equations are:

Model IIIa
ΔY_t = α + βt + θD(T_B)_t + δDU_t + (ρ-1)Y_t-1 + ∑_j=1,2,...,Jρ_jΔY_t-j + ε_t

Model IIIb
ΔY_t = α + βt + δDU_t + γDT_t + (ρ-1)Y_t-1 + ∑_j=1,2,...,Jρ_jΔY_t-j + ε_t

Model IIIc
ΔY_t = α + βt + θD(T_B)_t + δDU_t + γDT_t + (ρ-1)Y_t-1 + ∑_j=1,2,...,Jρ_jΔY_t-j + ε_t

For each version of testing equation, at the location of breakpoint λ, t statistic of the lag parameter ρ or t_ρ(λ) is compared with the critical values of the asymptotic distribution of this statistic. We reject the null hypothesis of unit root if the computed t_ρ(λ) is less than the critical values for a given λ.

Endogenous Structural Break

If the breakpoint λ is unknown and must be estimated, the null hypothesis is:

Y_t = α + Y_t-1 + ε_t

Therefore three versions of unit roots test are:

Model IIIa
H₀: Y_t = α + Y_t-1 + ε_t
H₁: Y_t = α₁ + βt + (α₂-α₁)DU_t(λ) + ε_t

Model IIIb
H₀: Y_t = α + Y_t-1 + ε_t
H₁: Y_t = α + β₁t + (β₂-β₁)DT_t(λ) + ε_t

Model IIIc
H₀: Y_t = α + Y_t-1 + ε_t
H₁: Y_t = α₁ + β₁t + (α₂-α₁)DU_t(λ) + (β₂-β₁)DT_t(λ) + ε_t

The corresponding augmented testing equations are:

Model IIIa
ΔY_t = α + βt + δDU_t(λ) + (ρ-1)Y_t-1 + ∑_j=1,2,...,Jρ_jΔY_t-j + ε_t

Model IIIb
ΔY_t = α + βt + γDT_t(λ) + (ρ-1)Y_t-1 + ∑_j=1,2,...,Jρ_jΔY_t-j + ε_t

Model IIIc
ΔY_t = α + βt + δDU_t(λ) + γDT_t(λ) + (ρ-1)Y_t-1 + ∑_j=1,2,...,Jρ_jΔY_t-j + ε_t

We write the dummy variables DU_t and DT_t as functions of the breakpoint λ, which is the outcome of fitting Y_t to a certain trend stationary process with a one-time structural break at an unknown point in time. The purpose is to estimate the breakpoint that gives the most weight to the trend stationary alternative. In other words, λ^* is chosen to minimize the one-sided t statistic for testing the lag parameter ρ = 1. The estimated breakpoint λ^* and the minimum t statistic are obtained as follows:

For Model IIIa, IIIb, and IIIc, estimate the test equation for all possible values of λ in (0,1). That is, from T_B=2 to T_B=T-1, run T-2 regressions and collect all the t statistics for testing ρ=1. Note that the number of augmented lags J used in the test equation may be different for each λ=T_B/T.

Let t_ρ^* = min_{λ in (0,1)}{t_ρ(λ)}, and let λ^* be the estimated breakpoint corresponding to this minimum t statistic. Zivot and Andrews [1992] tabulate the critical values of the asymptotic distribution of t_ρ^*. The computed t_ρ^* is compared with these critical values. We reject the null hypothesis of a unit root if the computed t_ρ^* is less than the critical value for a given level of significance.
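
The search over breakpoints can be sketched as follows. This is a simplified illustration, not Zivot and Andrews' procedure in full: it assumes Model IIIa only, uses J = 0 for every break date, and the function names and simulated series are hypothetical.

```python
import numpy as np

def t_rho(y, T_B):
    """t statistic for rho = 1 in Model IIIa with J = 0 and break date T_B."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    t = np.arange(2, T + 1, dtype=float)      # time index of ΔY_t, t = 2..T
    du = (t > T_B).astype(float)              # DU_t(lambda)
    X = np.column_stack([np.ones(T - 1), t, du, y[:-1]])
    dep = np.diff(y)
    coef, *_ = np.linalg.lstsq(X, dep, rcond=None)
    resid = dep - X @ coef
    s2 = (resid @ resid) / (len(dep) - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    return float(coef[3] / np.sqrt(cov[3, 3]))

def min_t_break(y):
    """Run the T-2 regressions for T_B = 2..T-1; return (lambda*, t_rho*)."""
    T = len(y)
    breaks = np.arange(2, T)
    stats = np.array([t_rho(y, tb) for tb in breaks])
    k = int(np.argmin(stats))
    return float(breaks[k] / T), float(stats[k])

# Illustrative trend stationary series with a level shift after t = 60.
rng = np.random.default_rng(13)
T = 120
time = np.arange(1, T + 1)
y = 0.05 * time + 2.0 * (time > 60) + rng.normal(size=T)
lam_star, t_star = min_t_break(y)
```

The minimum statistic t_star is then compared with the λ^* rows of Table 3 rather than with the fixed-λ rows.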

Cointegration Tests

Consider a set of M variables Z_t (a 1xM vector). If Z_t ~ I(1), a linear combination of the columns of Z_t is usually again I(1). Are there situations in which one or more such linear combinations result in a stationary process, or I(0)? In other words, are the variables in Z_t cointegrated? A regression relationship involving Z_t is meaningful (not spurious) only if the variables in Z_t are cointegrated.

Cointegration Test: The Engle-Granger Approach

Without loss of generality, let Y_t = Z_t1 and X_t = [Z_t2, ..., Z_tM]. Consider the following regression equation:

Y_t = α + X_tβ + ε_t

In general, if Y_t, X_t ~ I(1), then ε_t ~ I(1). If ε_t can be shown to be I(0), then the set of variables [Y_t, X_t] is cointegrated, and the vector [1 -β]' (or any multiple of it) is called a cointegrating vector. Depending on the number of variables M, there are up to M-1 linearly independent cointegrating vectors. The number of linearly independent cointegrating vectors that exist in [Y_t, X_t] is called the cointegrating rank.

A simple way to test for cointegration is to apply unit roots test on the residuals of the above regression equation. Let

N = Number of usable sample observations;
K = Number of variables in [Y_t,X_t] for cointegration test

The unit roots test for the regression residuals, or the cointegration test, is formulated as follows:

Δε_t = (ρ-1)ε_t-1 + u_t

or with augmented lags:

Δε_t = (ρ-1)ε_t-1 + ∑_j=1,2,...,J ρ_jΔε_t-j + u_t

Hypothesis: H₀: ρ = 1 against H₁: ρ < 1
Test Statistic: t_ρ = (p-1)/se(p), where p is the estimate of ρ
Critical Value: ADF(I,N,e)

If we can reject the null hypothesis of unit root on the residuals ε_t, we can say that variables [Y_t, X_t] in the regression equation are cointegrated. The cointegrating regression model may be generalized to include trend as follows:

Y_t = α + γt + X_tβ + ε_t

Notice that the trend in the cointegrating regression equation may be the result of combined drifts in X and/or Y.

J. MacKinnon's table of critical values of cointegration tests for cointegrating regressions without and with trend (named Model 2 and Model 3, respectively) is provided in Table 5. It is based on simulation experiments by means of response surface regression, in which critical values depend on the sample size. Therefore, this table is easier and more flexible to use than the original EG and AEG distributions.
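
A sketch of the residual-based test on simulated data. The data generating process, seed, and function names are assumptions for illustration. Step one estimates the cointegrating regression. Step two computes the Dickey-Fuller t statistic on the residuals without augmented lags.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

def df_t_stat_no_constant(e):
    """t statistic for rho = 1 in Δe_t = (rho - 1) e_{t-1} + u_t."""
    de, lag = np.diff(e), e[:-1]
    g = (lag @ de) / (lag @ lag)             # estimate of (rho - 1)
    u = de - g * lag
    se = np.sqrt((u @ u) / (len(de) - 1) / (lag @ lag))
    return float(g / se)

# Assumed example: X is a random walk and Y = 1 + 2X + stationary noise,
# so [Y, X] is cointegrated with cointegrating vector [1, -2]'.
rng = np.random.default_rng(3)
n = 500
x = np.cumsum(rng.normal(size=n))
y = 1.0 + 2.0 * x + rng.normal(size=n)

coef, resid = ols(np.column_stack([np.ones(n), x]), y)
t_resid = df_t_stat_no_constant(resid)
```

Because the residuals come from an estimated regression, t_resid is compared with cointegration critical values (Tables 4 and 5), which depend on the number of variables K.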

Error Correction Model

When Y_t and X_t are cointegrated, we have

Y_t = α + X_tβ + ε_t
Δε_t = (ρ-1)ε_t-1 + u_t

where ρ < 1 and u_t is stationary. Therefore the short-run dynamics of the model are

ΔY_t = ΔX_tβ + Δε_t

= ΔX_tβ + (ρ-1)ε_t-1 + u_t

= ΔX_tβ + (ρ-1)(Y_t-1-α-X_t-1β) + u_t

This is exactly the Error Correction Model.
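
A two-step sketch of estimating the error correction model on simulated data. The data generating process and variable names are assumptions. The lagged residual from the levels regression stands in for Y_t-1-α-X_t-1β, and its coefficient estimates the adjustment term (ρ-1).

```python
import numpy as np

# Assumed cointegrated pair: X is a random walk, Y = 1 + 2X + white noise.
rng = np.random.default_rng(5)
n = 500
x = np.cumsum(rng.normal(size=n))
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Step 1: long-run (cointegrating) regression in levels.
X1 = np.column_stack([np.ones(n), x])
b1, *_ = np.linalg.lstsq(X1, y, rcond=None)
e = y - X1 @ b1                      # estimated equilibrium error

# Step 2: short-run regression of ΔY_t on ΔX_t and the lagged error e_{t-1}.
X2 = np.column_stack([np.diff(x), e[:-1]])
b2, *_ = np.linalg.lstsq(X2, np.diff(y), rcond=None)
beta_short_run, adjustment = b2      # adjustment estimates (rho - 1)
```

In this design the equilibrium error is white noise (ρ = 0), so the adjustment coefficient should be close to -1. A value between -1 and 0 would indicate partial correction each period.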

Cointegration Test: The Johansen Approach

Given a set of M variables Z_t=[Z_t1, Z_t2, ..., Z_tM], and considering their simultaneity, Johansen's FIML (Full Information Maximum Likelihood) approach to cointegration testing is derived from

A VAR (Vector Autoregression) System Model Representation
FIML Estimation of the Linear Equations System
Canonical Correlations Analysis

Similar to the random walk (unit roots) hypothesis testing for a single variable with augmented lags, we write a VAR(p) linear system for the M variables Z_t:

Z_t = Z_t-1Π₁ + Z_t-2Π₂ + ... + Z_t-pΠ_p + Π₀ + U_t

where Π_j, j=1,2,...,p, are the MxM parameter matrices, Π₀ is a 1xM drift or constant vector, and the 1xM error vector U_t ~ normal(0,Σ) with a constant matrix Σ = Var(U_t) = E(U_t'U_t) denoting the covariance matrix across M variables.

The VAR(p) system can be transformed using the difference series of the variables, resembling the error correction model representation, as follows:

ΔZ_t = ΔZ_t-1γ₁ + ΔZ_t-2γ₂ + ... + ΔZ_t-(p-1)γ_p-1 + Z_t-1Π + γ₀ + U_t

where Π = ∑_j=1,2,...,pΠ_j - I, γ₁ = Π₁ - Π - I , γ₂ = Π₂ + γ₁, ..., and γ₀ = Π₀ for notational convenience.

If Z_t ~ I(1), then ΔZ_t ~ I(0). For the variables in Z_t to be cointegrated, we must have U_t ~ I(0); that is, the term Z_t-1Π must be I(0). By the definition of cointegration, the parameter matrix Π must contain 0 < r < M linearly independent cointegrating vectors such that Z_tΠ ~ I(0). Therefore, the cointegration test amounts to checking that Rank(Π) = r > 0.

If Rank(Π) = r, we may impose the parameter restrictions Π = BA' where A and B are Mxr matrices. Since A is an Mxr matrix of rank r, we can rewrite the constant as γ₀ = μA'+γ, where μ is 1xr and γ is 1xM, with γ orthogonal to μA'. That is, μA'γ' = 0. Therefore,

ΔZ_t = ΔZ_t-1γ₁ + ΔZ_t-2γ₂ + ... + ΔZ_t-(p-1)γ_p-1 + γ + (Z_t-1B+μ)A' + U_t

Given the existence of the constant vector γ₀ = μA'+γ, there can be up to M-r random walks or the drift trends. Such common trends in the variables may be removed in the case of Model II below. We consider the following three models:

Model I: VAR(p) representation without constant vector: μ = γ = 0
ΔZ_t = ΔZ_t-1γ₁ + ΔZ_t-2γ₂ + ... + ΔZ_t-(p-1)γ_p-1 + Z_t-1BA' + U_t
Model II: VAR(p) representation with restricted constant vector (or trend removed drift only): γ = 0
ΔZ_t = ΔZ_t-1γ₁ + ΔZ_t-2γ₂ + ... + ΔZ_t-(p-1)γ_p-1 + (Z_t-1B+μ)A' + U_t
Model III: VAR(p) representation with constant vector (drift trend)
ΔZ_t = ΔZ_t-1γ₁ + ΔZ_t-2γ₂ + ... + ΔZ_t-(p-1)γ_p-1 + γ + (Z_t-1B+μ)A' + U_t

For model estimation of the above VAR(p) system, where U_t ~ normal(0,Σ), we derive the log-likelihood function for Model III:

ll(γ₁,γ₂,..., γ_p-1,γ₀,Π,Σ) = - MN/2 ln(2π) - N/2 ln|det(Σ)| - ½ ∑_t=1,2,...,NU_tΣ^-1U_t'

Since the maximum likelihood estimate of Σ is U'U/N, the concentrated log-likelihood function is written as:

ll*(γ₁,γ₂,..., γ_p-1,γ₀,Π) = - NM/2 (1+ln(2π)-ln(N)) - N/2 ln|det(U'U)|

The actual maximum likelihood estimation can be simplified by considering the following two auxiliary regressions:

ΔZ_t = ΔZ_t-1Φ₁ + ΔZ_t-2Φ₂ + ... + ΔZ_t-(p-1)Φ_p-1 + Φ₀ + W_t
Z_t-1 = ΔZ_t-1Ψ₁ + ΔZ_t-2Ψ₂ + ... + ΔZ_t-(p-1)Ψ_p-1 + Ψ₀ + V_t

Then γ_j = Φ_j-Ψ_jΠ, for j=0,1,2,...,p-1, and U_t = W_t - V_tΠ. If Φ₀ = Ψ₀ = 0, then γ₀ = 0 implying no drift in the VAR(p) representation. However, γ₀ = 0 will need only the restriction that Φ₀ = Ψ₀Π.

Returning to the concentrated log-likelihood function, it is now written as

ll*(W(Φ₁,Φ₂,...,Φ_p-1,Φ₀), V(Ψ₁,Ψ₂,...,Ψ_p-1,Ψ₀),Π)
= - NM/2 (1+ln(2π)-ln(N)) - N/2 ln|det((W-VΠ)'(W-VΠ))|

Maximizing the above concentrated log-likelihood function is equivalent to minimizing the generalized sum-of-squares term det((W-VΠ)'(W-VΠ)). Conditional on W(Φ₁,Φ₂,...,Φ_p-1,Φ₀) and V(Ψ₁,Ψ₂,...,Ψ_p-1,Ψ₀), the least squares estimate of Π is (V'V)^-1V'W. Thus,

det((W-VΠ)'(W-VΠ))
= det(W'(I-V(V'V)^-1V')W)
= det((W'W)(I-(W'W)^-1(W'V)(V'V)^-1(V'W)))
= det(W'W) det(I-(W'W)^-1(W'V)(V'V)^-1(V'W))
= det(W'W) (∏_i=1,2,...,M(1-λ_i))

where λ₁, λ₂, ..., λ_M are the eigenvalues of the matrix (W'W)^-1(W'V)(V'V)^-1(V'W), ordered from largest to smallest. Therefore the resulting double concentrated log-likelihood function (concentrating on both Σ and Π) is

ll**(W(Φ₁,Φ₂,...,Φ_p-1,Φ₀), V(Ψ₁,Ψ₂,...,Ψ_p-1,Ψ₀))
= - NM/2 (1+ln(2π)-ln(N)) - N/2 ln|det(W'W)| - N/2 ∑_i=1,2,...,Mln(1-λ_i)

Given the parameter constraints that there are 0 < r < M cointegrating vectors, that is, Π = BA' where A and B are Mxr matrices, the restricted concentrated log-likelihood function is similarly derived as follows:

ll_r**(W(Φ₁,Φ₂,...,Φ_p-1,Φ₀), V(Ψ₁,Ψ₂,...,Ψ_p-1,Ψ₀))
= - NM/2 (1+ln(2π)-ln(N)) - N/2 ln|det(W'W)| - N/2 ∑_i=1,2,...,rln(1-λ_i)

Therefore, with M-r degrees of freedom, the likelihood ratio test statistic for at most r cointegrating vectors (against the unrestricted alternative of M) is

-2(ll_r** - ll**) = -N ∑_{i=r+1,r+2,...,M}ln(1-λ_i)

Similarly the likelihood ratio test statistic for r cointegrating vectors against r+1 vectors is

-2(ll_r** - ll_r+1**) = -N ln(1-λ_r+1)

A more general form of the likelihood ratio test statistic for r1 cointegrating vectors against r2 vectors (0 ≤ r1 < r2 ≤ M) is

-2(ll_r1** - ll_r2**) = -N ∑_{i=r1+1,r1+2,...,r2}ln(1-λ_i)

The following table summarizes the two popular cointegration test statistics: Eigenvalue Test Statistic λ_max(r) and Trace Test Statistic λ_trace(r). For the case of r = 0, they are the tests for no cointegration.

Cointegrating   Eigenvalue Test      Trace Test
Rank (r)        H₀: r1 = r           H₀: r1 = r
                H₁: r2 = r+1         H₁: r2 = M
0               -N ln(1-λ₁)          -N ∑_i=1,2,...,M ln(1-λ_i)
1               -N ln(1-λ₂)          -N ∑_i=2,3,...,M ln(1-λ_i)
...             ...                  ...
M-1             -N ln(1-λ_M)         -N ln(1-λ_M)
Critical Value  λ_max(r)             λ_trace(r)
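
The eigenvalue and trace statistics can be computed directly from the residual matrices W and V. The sketch below makes simplifying assumptions: a VAR(1) with an unrestricted constant, so that W and V are just the demeaned ΔZ_t and Z_t-1. The function name and simulated data are hypothetical, and the eigenvalues are sorted from largest to smallest.

```python
import numpy as np

def johansen_stats(Z):
    """Eigenvalues, eigenvalue statistics and trace statistics for p = 1."""
    Z = np.asarray(Z, dtype=float)
    dZ, lag = np.diff(Z, axis=0), Z[:-1]
    # With p = 1 the auxiliary regressions contain only the constant,
    # so their residuals are the demeaned series.
    W = dZ - dZ.mean(axis=0)
    V = lag - lag.mean(axis=0)
    N = W.shape[0]
    # (W'W)^-1 (W'V) (V'V)^-1 (V'W)
    A = np.linalg.solve(W.T @ W, W.T @ V) @ np.linalg.solve(V.T @ V, V.T @ W)
    lam = np.sort(np.linalg.eigvals(A).real)[::-1]   # largest first
    lam_max = -N * np.log(1.0 - lam)                 # entry r: -N ln(1 - lambda_{r+1})
    lam_trace = np.cumsum(lam_max[::-1])[::-1]       # entry r: sum over i = r+1..M
    return lam, lam_max, lam_trace

# Assumed example: two series sharing one common random walk trend.
rng = np.random.default_rng(11)
n = 400
trend = np.cumsum(rng.normal(size=n))
Z = np.column_stack([trend + rng.normal(size=n),
                     0.5 * trend + rng.normal(size=n)])
lam, lam_max, lam_trace = johansen_stats(Z)
```

Entry r of `lam_max` and `lam_trace` corresponds to row r of the table above. Both are compared with the Johansen critical values (Table 6).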

Appendix 1: Stability of a Dynamic Model

The stability of a dynamic model hinges on the characteristic equation for the autoregressive part of the model. The roots of the characteristic equation:

1 - ρ₁B - ρ₂B² - ... - ρ_pB^p = 0

must be greater than 1 in absolute value for the model to be stable.

For example, consider the AR(1) model. The characteristic equation is 1 - ρ₁B = 0. The single root of this equation is B = 1/ρ₁, which is greater than 1 in absolute value if |ρ₁| < 1. Similarly, for an AR(2) model, the two roots of the characteristic equation 1 - ρ₁B - ρ₂B² = 0 are B₁,B₂ = [-ρ₁±√(ρ₁²+4ρ₂)]/(2ρ₂). Requiring both roots to lie outside the unit circle gives the stability conditions:

ρ₁+ρ₂ < 1
ρ₂-ρ₁ < 1
|ρ₂| < 1.

A more general AR(p) model may be represented by VAR(1):

[ Y_t     ]   [ α ]   [ ρ₁  ρ₂  ..  ρ_p ] [ Y_t-1 ]   [ ε_t ]
[ Y_t-1   ] = [ 0 ] + [ 1   0   ..  0   ] [ Y_t-2 ] + [ 0   ]
[ :       ]   [ : ]   [ :   :   :   :   ] [ :     ]   [ :   ]
[ Y_t-p+1 ]   [ 0 ]   [ 0   ..  1   0   ] [ Y_t-p ]   [ 0   ]

That is, Y_t = α + ρ Y_t-1 + ε_t

By successive substitution, we obtain Y_t = α + ρα + ρ²α + ... (so that the equilibrium Y_∞ = (I-ρ)^-1α).

The roots of the asymmetric matrix ρ may be complex in the form a±bi, where i=√(-1). The stability requires that all the roots of ρ must be less than 1 in absolute value. That is, |a+bi| = √(a²+b²) < 1.

The unit circle refers to the two-dimensional set of values of a and b defined by a²+b²=1, which defines a circle centered at the origin with radius 1. Therefore, for a stable dynamic model, the roots of the characteristic equation

1 - ρ₁B - ρ₂B² - ... - ρ_pB^p = 0

which are the reciprocals of the characteristic roots of the matrix ρ, must lie outside the unit circle.
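
The reciprocal relationship between the two sets of roots can be checked numerically. A minimal sketch, with hypothetical function names and illustrative AR coefficients:

```python
import numpy as np

def companion(rho):
    """Companion (VAR(1)) matrix of an AR(p) model with coefficients rho_1..rho_p."""
    rho = np.asarray(rho, dtype=float)
    p = len(rho)
    C = np.zeros((p, p))
    C[0, :] = rho                    # first row holds the AR coefficients
    C[1:, :-1] = np.eye(p - 1)       # identity block shifts the lags down
    return C

def is_stable(rho):
    """True if all characteristic roots of the companion matrix lie inside the unit circle."""
    return bool(np.all(np.abs(np.linalg.eigvals(companion(rho))) < 1.0))

rho = [0.5, 0.3]                     # illustrative AR(2) coefficients
eig = np.linalg.eigvals(companion(rho))
# Roots of 1 - rho_1 B - ... - rho_p B^p = 0; np.roots expects the highest power first.
roots = np.roots(np.r_[-np.asarray(rho)[::-1], 1.0])
reciprocal_match = bool(np.allclose(np.sort(np.abs(1.0 / roots)),
                                    np.sort(np.abs(eig))))
```

Here both eigenvalues are below 1 in modulus and both polynomial roots are above 1, so the two statements of the stability condition agree.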

Appendix 2: Statistical Tables

Table 1: Critical Values for the Dickey-Fuller Unit Root t-Test Statistics

                        Probability to the Right of Critical Value
Model Statistic N    99%  97.5%    95%    90%    10%     5%   2.5%     1%
   I   ADF_{t_ρ}   25  -2.66  -2.26  -1.95  -1.60   0.92   1.33   1.70   2.16
              50  -2.62  -2.25  -1.95  -1.61   0.91   1.31   1.66   2.08
             100  -2.60  -2.24  -1.95  -1.61   0.90   1.29   1.64   2.03
             250  -2.58  -2.23  -1.95  -1.61   0.89   1.29   1.63   2.01
             500  -2.58  -2.23  -1.95  -1.61   0.89   1.28   1.62   2.00
            >500  -2.58  -2.23  -1.95  -1.61   0.89   1.28   1.62   2.00
  II   ADF_{t_ρ}   25  -3.75  -3.33  -3.00  -2.62  -0.37   0.00   0.34   0.72
              50  -3.58  -3.22  -2.93  -2.60  -0.40  -0.03   0.29   0.66
             100  -3.51  -3.17  -2.89  -2.58  -0.42  -0.05   0.26   0.63
             250  -3.46  -3.14  -2.88  -2.57  -0.42  -0.06   0.24   0.62
             500  -3.44  -3.13  -2.87  -2.57  -0.43  -0.07   0.24   0.61
            >500  -3.43  -3.12  -2.86  -2.57  -0.44  -0.07   0.23   0.60
 III   ADF_{t_ρ}   25  -4.38  -3.95  -3.60  -3.24  -1.14  -0.80  -0.50  -0.15
              50  -4.15  -3.80  -3.50  -3.18  -1.19  -0.87  -0.58  -0.24
             100  -4.04  -3.73  -3.45  -3.15  -1.22  -0.90  -0.62  -0.28
             250  -3.99  -3.69  -3.43  -3.13  -1.23  -0.92  -0.64  -0.31
             500  -3.98  -3.68  -3.42  -3.13  -1.24  -0.93  -0.65  -0.32
            >500  -3.96  -3.66  -3.41  -3.12  -1.25  -0.94  -0.66  -0.33

                        Probability to the Right of Critical Value
Model Statistic N     1%   2.5%     5%    10% (Symmetric Distribution, given ρ = 1)
  II   ADF_{t_α}   25   3.14   2.97   2.61   2.20
              50   3.28   2.89   2.56   2.18
             100   3.22   2.86   2.54   2.17
             250   3.19   2.84   2.53   2.16
             500   3.18   2.83   2.52   2.16
            >500   3.18   2.83   2.52   2.16
 III   ADF_{t_α}   25   4.05   3.59   3.20   2.77
              50   3.87   3.47   3.14   2.78
             100   3.78   3.42   3.11   2.73
             250   3.74   3.39   3.09   2.73
             500   3.72   3.38   3.08   2.72
            >500   3.71   3.38   3.08   2.72
 III   ADF_{t_β}   25   3.74   3.25   2.85   2.39
              50   3.60   3.18   2.81   2.38
             100   3.53   3.14   2.79   2.38
             250   3.49   3.12   2.79   2.38
             500   3.48   3.11   2.78   2.38
            >500   3.46   3.11   2.78   2.38

Table 2: Critical Values for the Dickey-Fuller Unit Root F-Test Statistics

                        Probability to the Right of Critical Value
Model Statistic N    1%    2.5%     5%    10%    90%    95%  97.5%    99%
  II   ADF_{F_α,ρ}  25   7.88   6.30   5.18   4.12   0.65   0.49   0.38   0.29
              50   7.06   5.80   4.86   3.94   0.66   0.50   0.30   0.29
             100   6.70   5.57   4.71   3.86   0.67   0.50   0.30   0.29
             250   6.52   5.45   4.63   3.81   0.67   0.51   0.39   0.30
             500   6.47   5.41   4.61   3.79   0.67   0.51   0.39   0.30
            >500   6.43   5.38   4.59   3.78   0.67   0.51   0.40   0.30
 III   ADF_{F_α,β,ρ} 25   8.21   6.75   5.68   4.67   1.10   0.89   0.75   0.61
              50   7.02   5.94   5.13   4.31   1.12   0.91   0.77   0.62
             100   6.50   5.59   4.88   4.16   1.12   0.92   0.77   0.63
             250   6.22   5.40   4.75   4.07   1.13   0.92   0.77   0.63
             500   6.15   5.35   4.71   4.05   1.13   0.92   0.77   0.63
            >500   6.09   5.31   4.68   4.03   1.13   0.92   0.77   0.63
 III   ADF_{F_β,ρ}  25  10.61   8.65   7.24   5.91   1.33   1.08   0.90   0.74
              50   9.31   7.81   6.73   5.61   1.37   1.11   0.93   0.76
             100   8.73   7.44   6.49   5.47   1.38   1.12   0.94   0.76
             250   8.43   7.25   6.34   5.39   1.39   1.13   0.94   0.76
             500   8.34   7.20   6.30   5.36   1.39   1.13   0.94   0.76
            >500   8.27   7.16   6.25   5.34   1.39   1.13   0.94   0.77

Table 3: Critical Values for the Dickey-Fuller Unit Root t-Test Statistics with One-Time Structural Break

Model:

Model IIIa
ΔY_t = α + βt + δDU_t(λ) + (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t

Model IIIb
ΔY_t = α + βt + γDT_t(λ) + (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t

Model IIIc
ΔY_t = α + βt + δDU_t(λ) + γDT_t(λ) + (ρ-1)Y_t-1 + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t

where λ = T_B/T (T is the sample size and T_B is the break point), and

DU_t(λ) = 1 if t > T_B, 0 otherwise

DT_t(λ) = t-T_B if t > T_B, 0 otherwise

λ^* is the estimated break fraction that minimizes the t-statistic t_ρ(λ) for the unit root test over the range 0 < λ < 1.

Source:
Perron (1989), Zivot and Andrews (1992).

                        Probability to the Right of Critical Value
Model Statistic λ    99%  97.5%    95%    90%    50%    10%     5%   2.5%     1%
 IIIa  ADF_{t_ρ(λ)}  λ^*  -5.34  -5.02  -4.80  -4.58  -3.75  -2.99  -2.77  -2.56  -2.32
              0.1  -4.30  -3.93  -3.68  -3.40  -2.35  -1.38  -1.09  -0.78  -0.46
              0.2  -4.39  -4.08  -3.77  -3.47  -2.45  -1.45  -1.14  -0.90  -0.54
              0.3  -4.39  -4.03  -3.76  -3.46  -2.42  -1.43  -1.13  -0.83  -0.51
              0.4  -4.34  -4.01  -3.72  -3.44  -2.40  -1.26  -0.88  -0.55  -0.21
              0.5  -4.32  -4.01  -3.76  -3.46  -2.37  -1.17  -0.79  -0.49  -0.15
              0.6  -4.45  -4.09  -3.76  -3.47  -2.38  -1.28  -0.92  -0.60  -0.26
              0.7  -4.42  -4.07  -3.80  -3.51  -2.45  -1.42  -1.10  -0.82  -0.50
              0.8  -4.33  -3.99  -3.75  -3.46  -2.43  -1.46  -1.13  -0.89  -0.57
              0.9  -4.27  -3.97  -3.69  -3.38  -2.39  -1.37  -1.04  -0.74  -0.47
 IIIb  ADF_{t_ρ(λ)}  λ^*  -4.93  -4.67  -4.42  -4.11  -3.23  -2.48  -2.31  -2.17  -1.97
              0.1  -4.27  -3.94  -3.65  -3.36  -2.34  -1.35  -1.04  -0.78  -0.40
              0.2  -4.41  -4.08  -3.80  -3.49  -2.50  -1.48  -1.18  -0.87  -0.52
              0.3  -4.51  -4.17  -3.87  -3.58  -2.54  -1.59  -1.27  -0.97  -0.69
              0.4  -4.55  -4.20  -3.94  -3.66  -2.61  -1.69  -1.37  -1.11  -0.75
              0.5  -4.55  -4.20  -3.96  -3.68  -2.70  -1.74  -1.40  -1.18  -0.82
              0.6  -4.57  -4.20  -3.95  -3.66  -2.61  -1.71  -1.36  -1.11  -0.78
              0.7  -4.51  -4.13  -3.85  -3.57  -2.55  -1.61  -1.28  -0.97  -0.67
              0.8  -4.38  -4.07  -3.82  -3.50  -2.47  -1.49  -1.16  -0.87  -0.54
              0.9  -4.26  -3.96  -3.68  -3.35  -2.33  -1.34  -1.04  -0.77  -0.43
 IIIc  ADF_{t_ρ(λ)}  λ^*  -5.57  -5.30  -5.08  -4.82  -3.98  -3.25  -3.06  -2.91  -2.72
              0.1  -4.38  -4.01  -3.75  -3.45  -2.38  -1.44  -1.11  -0.82  -0.45
              0.2  -4.65  -4.32  -3.99  -3.66  -2.67  -1.60  -1.27  -0.98  -0.67
              0.3  -4.78  -4.46  -4.17  -3.87  -2.75  -1.78  -1.46  -1.15  -0.81
              0.4  -4.81  -4.48  -4.22  -3.95  -2.88  -1.91  -1.62  -1.35  -1.04
              0.5  -4.90  -4.53  -4.24  -3.96  -2.91  -1.96  -1.69  -1.43  -1.07
              0.6  -4.88  -4.49  -4.24  -3.95  -2.87  -1.93  -1.63  -1.37  -1.08
              0.7  -4.75  -4.44  -4.18  -3.86  -2.77  -1.81  -1.47  -1.17  -0.79
              0.8  -4.70  -4.31  -4.04  -3.69  -2.67  -1.63  -1.29  -1.04  -0.64
              0.9  -4.41  -4.10  -3.80  -3.46  -2.41  -1.44  -1.12  -0.80  -0.50

Table 4: Critical Values for the Engle-Granger Cointegration t-Test Statistics Applied to Regression Residuals

Model:
Y_t = α + X_t β + ε_t
Δε_t = (ρ-1)ε_t-1 + ∑_j=1,2,...,J ρ_jΔε_t-j + u_t
K = Number of variables in the cointegration test, i.e. [Y_t, X_t].
t = 1,2,...,N (N = 500).

Model 2: E(Y_t) = E(X_t) = 0 (both X and Y have no drift)
Model 2a: E(X_t) ≠ 0 (at least one variable in X has drift)
Model 3: E(Y_t) ≠ 0 but E(X_t) = 0 (only Y has drift)

Note:
For the two-variable case of Model 2a, X is trended but Y is not, and the test is asymptotically equivalent to the ADF unit root test for Model III (see Table 1, ADF_{t_ρ} for N=500). If only Y has drift (Model 3), the cointegrating equation can be written as Y_t = α + γ t + X_t β + ε_t. The critical values of Model 2a therefore apply to Model 3, with the trend t treated as one extra variable that is not counted in K: the Model 3 row for K equals the Model 2a row for K+1.

Source:
Phillips and Ouliaris (1990)

 Model	K	 1%	 2.5%	   5%	  10%
   2	2	-3.96	-3.64	-3.37	-3.07
	3	-4.31	-4.02	-3.77	-3.45
	4	-4.73	-4.37	-4.11	-3.83
	5	-5.07	-4.71	-4.45	-4.16
	6	-5.28	-4.98	-4.71	-4.43
   2a	2	-3.98	-3.68	-3.42	-3.13
	3	-4.36	-4.07	-3.80	-3.52
	4	-4.65	-4.39	-4.16	-3.84
	5	-5.04	-4.77	-4.49	-4.20
	6	-5.36	-5.02	-4.74	-4.46
	7	-5.58	-5.31	-5.03	-4.73
   3	2	-4.36	-4.07	-3.80	-3.52
	3	-4.65	-4.39	-4.16	-3.84
	4	-5.04	-4.77	-4.49	-4.20
	5	-5.36	-5.02	-4.74	-4.46
	6	-5.58	-5.31	-5.03	-4.73

Table 5: Critical Values for Unit Root and Cointegration Tests Based on Response Surface Estimates

Critical values for unit root and cointegration tests can be computed from the equation:

CV(K, Model, N, sig) = b + b1 (1/N) + b2 (1/N)²

Notation:
Regression Model: 1=no constant, no trend; 2=constant, no trend; 3=constant and trend;
K: Number of variables in cointegration tests (K=1 for unit root test);
N: Number of observations or sample size;
sig: Level of significance, 0.01, 0.05, 0.1.

Source:
J. G. MacKinnon, "Critical Values for Cointegration Tests," Cointegrated Time Series, 267-276.

    K Model sig           b         b1         b2
    1    1    0.01     -2.5658     -1.960     -10.04
    1    1    0.05     -1.9393     -0.398       0.00
    1    1    0.10     -1.6156     -0.181       0.00
    1    2    0.01     -3.4335     -5.999     -29.25
    1    2    0.05     -2.8621     -2.738      -8.36
    1    2    0.10     -2.5671     -1.438      -4.48
    1    3    0.01     -3.9638     -8.353     -47.44
    1    3    0.05     -3.4126     -4.039     -17.83
    1    3    0.10     -3.1279     -2.418      -7.58
    2    2    0.01     -3.9001    -10.534     -30.03
    2    2    0.05     -3.3377     -5.967      -8.98
    2    2    0.10     -3.0462     -4.069      -5.73
    2    3    0.01     -4.3266    -15.531     -34.03
    2    3    0.05     -3.7809     -9.421     -15.06
    2    3    0.10     -3.4959     -7.203      -4.01
    3    2    0.01     -4.2981    -13.790     -46.37
    3    2    0.05     -3.7429     -8.352     -13.41
    3    2    0.10     -3.4518     -6.241      -2.79
    3    3    0.01     -4.6676    -18.492     -49.35
    3    3    0.05     -4.1193    -12.024     -13.13
    3    3    0.10     -3.8344     -9.188      -4.85
    4    2    0.01     -4.6493    -17.188     -59.20
    4    2    0.05     -4.1000    -10.745     -21.57
    4    2    0.10     -3.8110     -8.317      -5.19
    4    3    0.01     -4.9695    -22.504     -50.22
    4    3    0.05     -4.4294    -14.501     -19.54
    4    3    0.10     -4.1474    -11.165      -9.88
    5    2    0.01     -4.9587    -22.140     -37.29
    5    2    0.05     -4.4185    -13.461     -21.16
    5    2    0.10     -4.1327    -10.638      -5.48
    5    3    0.01     -5.2497    -26.606     -49.56
    5    3    0.05     -4.7154    -17.432     -16.50
    5    3    0.10     -4.4345    -13.654      -5.77
    6    2    0.01     -5.2400    -26.278     -41.65
    6    2    0.05     -4.7048    -17.120     -11.17
    6    2    0.10     -4.4242    -13.347       0.00
    6    3    0.01     -5.5127    -30.735     -52.50
    6    3    0.05     -4.9767    -20.883      -9.05
    6    3    0.10     -4.6999    -16.445       0.00
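
As an illustration of the response surface equation above, the following minimal sketch evaluates CV for a finite sample. Only three rows of Table 5 are copied into the (hypothetical) dictionary COEFFS; the function name critical_value is likewise an assumption made for this example.

```python
# Response-surface coefficients (b, b1, b2) copied from a few rows of Table 5.
# Keys are (K, Model, sig); extend the dictionary with other rows as needed.
COEFFS = {
    (1, 2, 0.05): (-2.8621, -2.738, -8.36),
    (1, 3, 0.05): (-3.4126, -4.039, -17.83),
    (2, 2, 0.05): (-3.3377, -5.967, -8.98),
}


def critical_value(k, model, n, sig):
    """Evaluate CV = b + b1*(1/N) + b2*(1/N)**2 for sample size n."""
    b, b1, b2 = COEFFS[(k, model, sig)]
    return b + b1 / n + b2 / n ** 2


# Unit root test (K=1), constant but no trend, N=100, 5% level.
cv_100 = critical_value(1, 2, 100, 0.05)
print(round(cv_100, 4))
```

As N grows, the two correction terms vanish and the value approaches the asymptotic coefficient b.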

Table 6: Critical Values for the Johansen's Cointegration Likelihood Ratio Test Statistics

Notation:
VAR Model: 1=no constant; 2=drift; 3=trend drift
N: Sample Size, 400
M: Number of Variables
r: Number of Cointegrating Vectors or Rank
Degree of Freedom = M-r

                    Probability to the Left of Critical Value
    Model  M-r      99%   97.5%     95%     90%     80%     50%
λ_max    1     1     6.51    4.93    3.84    2.86    1.82    0.58
       1     2    15.69   13.27   11.44    9.52    7.58    4.83
       1     3    22.99   20.02   17.89   15.59   13.31    9.71
       1     4    28.82   26.14   23.80   21.58   18.97   14.94
       1     5    35.17   32.51   30.04   27.62   24.83   20.16
       2     1   11.576   9.658   8.083   6.691   4.905   2.415
       2     2   18.782  16.403  14.595  12.783  10.666   7.474
       2     3   26.154  23.362  21.279  18.959  16.521  12.707
       2     4   32.616  29.599  27.341  24.917  22.341  17.875
       2     5   38.858  35.700  33.262  30.818  27.953  23.132
       3     1    6.936   5.332   3.962   2.816   1.699   0.447
       3     2   17.936  15.810  14.036  12.099  10.125   6.852
       3     3   25.521  23.002  20.778  18.697  16.324  12.381
       3     4   31.943  29.335  27.169  24.712  22.113  17.719
       3     5   38.341  35.546  33.178  30.774  27.899  23.211
λ_trace  1     1     6.51    4.93    3.84    2.86    1.82    0.58
       1     2    16.31   14.43   12.53   10.47    8.45    5.42
       1     3    29.75   26.64   24.31   21.63   18.83   14.30
       1     4    45.58   42.30   39.89   36.58   33.16   27.10
       1     5    66.52   62.91   59.46   55.44   51.13   43.79
       2     1   11.576   9.658   8.083   6.691   4.905   2.415
       2     2   21.962  19.611  17.844  15.583  13.038   9.355
       2     3   37.291  34.062  31.256  28.436  25.445  20.188
       2     4   55.551  51.801  48.419  45.248  41.623  34.873
       2     5   77.911  73.031  69.977  65.956  61.566  53.373
       3     1    6.936   5.332   3.962   2.816   1.699   0.447
       3     2   19.310  17.299  15.197  13.338  11.164   7.638
       3     3   35.397  32.313  29.509  26.791  23.868  18.759
       3     4   53.792  50.424  47.181  43.964  40.250  33.672
       3     5   76.955  72.140  68.905  65.063  60.215  52.588

Y_t	= ΔY_t + Y_t-1
	= ΔY_t + ΔY_t-1 + Y_t-2 = ...
	= Y₀ + ∑_j=0,...,t-1 ΔY_t-j, with a known Y₀

With the initial values set to zero (Y₀ = 0, ΔY₀ = 0):

Y_t	= ∑_j=0,...,t-1 ΔY_t-j
	= ∑_j=0,...,t-1 ∑_i=0,...,t-j-1 Δ²Y_t-j-i
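
The telescoping sums above say that the level of an integrated series can be rebuilt by cumulating its differences from a known starting value. A minimal sketch, in which the function name integrate and the sample numbers are assumptions chosen for illustration:

```python
from itertools import accumulate


def integrate(diffs, start=0):
    """Rebuild Y_1, ..., Y_T from differences dY_1, ..., dY_T and the start value Y_0."""
    # accumulate(..., initial=start) yields Y_0, Y_1, ..., Y_T; drop Y_0.
    return list(accumulate(diffs, initial=start))[1:]


# Hypothetical first differences with Y_0 = 2.
first_diffs = [1, 2, -1, 4]
levels = integrate(first_diffs, start=2)

# Hypothetical second differences with dY_0 = 0: cumulate twice to reach the level.
second_diffs = [1, 1, -3, 5]
levels_from_second = integrate(integrate(second_diffs, start=0), start=2)
print(levels, levels_from_second)
```

Cumulating once undoes one difference, so a series that is integrated of order two needs two passes and two starting values.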

Hypothesis	H₀: ρ = 1; H₁: ρ < 1
Test Statistic	t_ρ = (p-1)/se(p), where p is the estimated ρ
Critical Value	ADF_{t_ρ}(I,N,e)
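
The t-statistic above can be sketched for the simplest case: Model I with no lagged differences (J = 0), where ΔY_t is regressed on Y_t-1 without a constant and the slope estimates ρ-1. The function name and the sample series are assumptions for illustration; an applied test would add the augmentation lags.

```python
import math


def df_t_statistic(y):
    """Dickey-Fuller t-statistic for Model I with J = 0 lagged differences."""
    lagged = y[:-1]                                   # Y_{t-1}
    dy = [b - a for a, b in zip(y[:-1], y[1:])]       # dY_t = Y_t - Y_{t-1}
    sxx = sum(x * x for x in lagged)
    slope = sum(x * d for x, d in zip(lagged, dy)) / sxx   # estimate of (rho - 1)
    rss = sum((d - slope * x) ** 2 for x, d in zip(lagged, dy))
    n = len(dy)
    se = math.sqrt(rss / (n - 1) / sxx)               # standard error of the slope
    return slope / se                                 # equals (p - 1) / se(p)


# Hypothetical, strongly mean-reverting series.
series = [1.0, -0.5, 0.4, -0.1, 0.2, -0.3]
t_rho = df_t_statistic(series)
print(t_rho)
```

The resulting value is compared with the ADF_{t_ρ} critical value for Model I in Table 1, not with the standard t distribution.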

Hypothesis	H₀: ρ = 1; H₁: ρ < 1	H₀: α = 0, given ρ = 1; H₁: α ≠ 0
Test Statistic	t_ρ = (p-1)/se(p), where p is the estimated ρ	t_α = a/se(a), where a is the estimated α
Critical Value	ADF_{t_ρ}(II,N,e)	ADF_{t_α}(II,N,e)

Hypothesis	H₀: α = 0, ρ = 1; H₁: not H₀
Restricted Model	ΔY_t = ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t
Test Statistic	F_α,ρ = [(RSS_r-RSS_ur)/2] / [RSS_ur/(N-J-2)]
Critical Value	ADF_{F_α,ρ}(II,N,e)

Hypothesis	H₀: α = 0, β = 0, ρ = 1; H₁: not H₀	H₀: β = 0, ρ = 1; H₁: not H₀
Restricted Model	ΔY_t = ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t	ΔY_t = α + ∑_j=1,2,...,J ρ_jΔY_t-j + ε_t
Test Statistic	F_α,β,ρ = [(RSS_r-RSS_ur)/3] / [RSS_ur/(N-J-3)]	F_β,ρ = [(RSS_r-RSS_ur)/2] / [RSS_ur/(N-J-3)]
Critical Value	ADF_{F_α,β,ρ}(III,N,e)	ADF_{F_β,ρ}(III,N,e)
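
The F-statistics above share one form: the drop in the residual sum of squares per restriction, divided by the unrestricted residual variance. A minimal sketch follows; the function name, argument names, and RSS values are hypothetical. Here n_params counts the non-lag parameters of the unrestricted model (2 for Model II, 3 for Model III).

```python
def adf_f_statistic(rss_r, rss_ur, n_restrictions, n_obs, n_lags, n_params):
    """F = [(RSS_r - RSS_ur)/q] / [RSS_ur/(N - J - k)]."""
    dof = n_obs - n_lags - n_params          # N - J - k
    if dof <= 0:
        raise ValueError("not enough observations for the unrestricted model")
    return ((rss_r - rss_ur) / n_restrictions) / (rss_ur / dof)


# Hypothetical Model II example: H0: alpha = 0, rho = 1 (2 restrictions),
# N = 104 observations, J = 2 lagged differences.
f_alpha_rho = adf_f_statistic(120.0, 100.0, n_restrictions=2,
                              n_obs=104, n_lags=2, n_params=2)
print(f_alpha_rho)
```

The value is compared with the ADF_{F} critical values in Table 2 rather than with the standard F distribution.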

D(T_B)_t =	1, if t = T_B+1
	0 otherwise
DU_t =	1, if t>T_B
	0 otherwise
DT_t =	t-T_B, if t>T_B
	0 otherwise
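
The three break dummies defined above can be constructed directly from the sample size and the break date. A minimal sketch, where the function name break_dummies and the example values are assumptions for illustration:

```python
def break_dummies(n_obs, t_break):
    """Return (D, DU, DT) for t = 1, ..., n_obs given the break date T_B."""
    periods = range(1, n_obs + 1)
    d = [1 if t == t_break + 1 else 0 for t in periods]        # one-time pulse
    du = [1 if t > t_break else 0 for t in periods]            # level shift
    dt = [t - t_break if t > t_break else 0 for t in periods]  # trend shift
    return d, du, dt


# Hypothetical sample of 8 observations with a break at T_B = 5.
d, du, dt = break_dummies(n_obs=8, t_break=5)
print(d, du, dt)
```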

ΔY_t	= ΔX_tβ + Δε_t
	= ΔX_tβ + (ρ-1)ε_t-1 + u_t
	= ΔX_tβ + (ρ-1)(Y_t-1-α-X_t-1β) + u_t

Cointegrating Rank (r)	H₀: r1 = r; H₁: r2 = r+1	H₀: r1 = r; H₁: r2 = M
0	-N ln(1-λ₁)	-N ∑_i=1,2,...,M ln(1-λ_i)
1	-N ln(1-λ₂)	-N ∑_i=2,3,...,M ln(1-λ_i)
...	...	...
M-1	-N ln(1-λ_M)	-N ln(1-λ_M)
Critical Value	λ_max(r)	λ_trace(r)
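
Given the ordered eigenvalues λ₁ ≥ λ₂ ≥ ... ≥ λ_M, both likelihood ratio statistics in the table above follow from one line each. A minimal sketch, in which the function name and the eigenvalues are hypothetical (in practice the eigenvalues come from the Johansen estimation):

```python
import math


def johansen_statistics(eigenvalues, n_obs, r):
    """Return (lambda_max, lambda_trace) for H0: rank = r.

    eigenvalues must be sorted in descending order.
    """
    m = len(eigenvalues)
    if not 0 <= r < m:
        raise ValueError("r must satisfy 0 <= r < M")
    lam_max = -n_obs * math.log(1.0 - eigenvalues[r])   # uses lambda_{r+1}
    lam_trace = -n_obs * sum(math.log(1.0 - lam) for lam in eigenvalues[r:])
    return lam_max, lam_trace


# Hypothetical eigenvalues for M = 3 variables and N = 100 observations.
eigs = [0.20, 0.10, 0.02]
for rank in range(len(eigs)):
    print(rank, johansen_statistics(eigs, 100, rank))
```

For r = M-1 the two statistics coincide, which is why the M-r = 1 rows of Table 6 are identical for λ_max and λ_trace.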
