$$1 - \rho_1 B - \rho_2 B^2 - \cdots - \rho_p B^p = 0$$
must be greater than 1 in absolute value for the model to be stable.
For example, consider the AR(1) model. The characteristic equation is $1 - \rho_1 B = 0$. Its single root is $B = 1/\rho_1$, which is greater than 1 in absolute value if $|\rho_1| < 1$. Similarly, for an AR(2) model, the two roots of the characteristic equation $1 - \rho_1 B - \rho_2 B^2 = 0$ are $B_{1,2} = 2/[\rho_1 \pm \sqrt{\rho_1^2 + 4\rho_2}]$, the reciprocals of $[\rho_1 \pm \sqrt{\rho_1^2 + 4\rho_2}]/2$. Therefore, the stability conditions are: $\rho_1 + \rho_2 < 1$, $\rho_2 - \rho_1 < 1$, and $|\rho_2| < 1$.
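To make the stability check concrete, here is a minimal sketch (assuming NumPy; the helper name `ar_is_stable` is ours for illustration) that finds the roots of the characteristic equation numerically and tests whether they all lie outside the unit circle:

```python
# A minimal sketch (assuming NumPy): check AR stability by finding the
# roots of the characteristic equation 1 - rho_1*B - ... - rho_p*B^p = 0.
import numpy as np

def ar_is_stable(rho):
    """rho = [rho_1, ..., rho_p]; stable iff all roots lie outside the unit circle."""
    # np.roots wants coefficients from the highest power of B down to the constant:
    # -rho_p*B^p - ... - rho_1*B + 1 = 0
    coeffs = np.concatenate((-np.asarray(rho)[::-1], [1.0]))
    roots = np.roots(coeffs)
    return np.all(np.abs(roots) > 1.0), roots

# AR(2) with rho_1 = 0.5, rho_2 = 0.3 satisfies the three conditions above
print(ar_is_stable([0.5, 0.3]))   # stable
print(ar_is_stable([0.5, 0.6]))   # rho_1 + rho_2 > 1 -> unstable
```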
A more general AR(p) model may be represented as a VAR(1):
$$\begin{bmatrix} Y_t \\ Y_{t-1} \\ \vdots \\ Y_{t-p+1} \end{bmatrix} = \begin{bmatrix} \alpha \\ 0 \\ \vdots \\ 0 \end{bmatrix} + \begin{bmatrix} \rho_1 & \rho_2 & \cdots & \rho_{p-1} & \rho_p \\ 1 & 0 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} Y_{t-1} \\ Y_{t-2} \\ \vdots \\ Y_{t-p} \end{bmatrix} + \begin{bmatrix} \varepsilon_t \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
That is, $Y_t = \alpha + \rho Y_{t-1} + \varepsilon_t$.
By successive substitution, we obtain $Y_t = \alpha + \rho\alpha + \rho^2\alpha + \cdots + \varepsilon_t + \rho\varepsilon_{t-1} + \rho^2\varepsilon_{t-2} + \cdots$, so that the equilibrium is $Y_\infty = (I - \rho)^{-1}\alpha$.
The characteristic roots (eigenvalues) of the nonsymmetric matrix $\rho$ may be complex, of the form $a \pm bi$, where $i = \sqrt{-1}$. Stability requires that all the roots of $\rho$ be less than 1 in absolute value; that is, $|a + bi| = \sqrt{a^2 + b^2} < 1$.
The unit circle refers to the two-dimensional set of values of $a$ and $b$ satisfying $a^2 + b^2 = 1$, which defines a circle centered at the origin with radius 1. Therefore, for a stable dynamic model, the roots of the characteristic equation
$$1 - \rho_1 B - \rho_2 B^2 - \cdots - \rho_p B^p = 0,$$
which are the reciprocals of the characteristic roots of the matrix $\rho$, must lie outside the unit circle.
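The reciprocal relation can be verified numerically. A minimal sketch (assuming NumPy) that builds the companion matrix $\rho$ for an AR(2) model and compares its eigenvalue moduli with the reciprocals of the characteristic-equation roots:

```python
# A minimal sketch (assuming NumPy): the eigenvalues of the companion matrix rho
# are the reciprocals of the roots of 1 - rho_1*B - ... - rho_p*B^p = 0.
import numpy as np

rho = [0.5, 0.3]                       # AR(2) coefficients
p = len(rho)
companion = np.zeros((p, p))
companion[0, :] = rho                  # first row: rho_1, ..., rho_p
companion[1:, :-1] = np.eye(p - 1)     # subdiagonal identity block

eigvals = np.linalg.eigvals(companion)
poly_roots = np.roots(np.concatenate((-np.asarray(rho)[::-1], [1.0])))
print(np.sort(np.abs(eigvals)))        # all < 1 for a stable model
print(np.sort(np.abs(1 / poly_roots))) # same moduli: the reciprocal relation
```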
Let $Y = \{Y_t\}_{t=-\infty}^{\infty}$ be a covariance stationary process with mean $\mu = E(Y_t)$ and $j$-th autocovariance $\gamma_j = E[(Y_t - \mu)(Y_{t-j} - \mu)]$, $j = 0, 1, 2, \ldots$. We assume $\gamma_j = \gamma_{-j}$ and that the autocovariances are absolutely summable: $\sum_{j=0}^{\infty} |\gamma_j| < \infty$. The autocovariance-generating function of $Y$ is
$$g_Y(z) = \sum_{j=-\infty}^{\infty} \gamma_j z^j$$
where $z$ denotes a complex scalar.
We note that a complex number can be represented in a two-dimensional $(x, y)$-space as
$$z = x + yi, \quad \text{where } i = \sqrt{-1}$$
or, equivalently, in polar coordinates $c$ (radius) and $\omega$ (angle):
$$c = (x^2 + y^2)^{1/2}, \quad x = c\cos(\omega), \quad y = c\sin(\omega)$$
so that
$$z = c\,[\cos(\omega) + i\sin(\omega)] = c\,e^{i\omega}$$
Examples
For an AR(p) process with lag polynomial $\rho(B) = 1 - \rho_1 B - \cdots - \rho_p B^p$:
$$g_Y(z) = \frac{\sigma^2}{\rho(z)\,\rho(z^{-1})}$$
For an ARMA(p,q) process with MA polynomial $\theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^q$:
$$g_Y(z) = \frac{\sigma^2\,\theta(z)\,\theta(z^{-1})}{\rho(z)\,\rho(z^{-1})}$$
$$s_Y(\omega) = \frac{g_Y(e^{-i\omega})}{2\pi} = \frac{1}{2\pi} \sum_{j=-\infty}^{\infty} \gamma_j e^{-i\omega j}$$
where $\omega$ is a real number.
$s_Y(\omega)$ is the spectrum, or spectral density function, of the time series process $Y$. In other words, for a time series process $Y$ with autocovariances $\gamma_j$, the spectral density can be computed at any particular value of $\omega$. The spectrum contains no new information beyond that in the autocovariances.
Examples
For an MA(q) process:
$$s_Y(\omega) = \frac{\sigma^2\,\theta(e^{-i\omega})\,\theta(e^{i\omega})}{2\pi}$$
For an AR(p) process:
$$s_Y(\omega) = \frac{\sigma^2}{2\pi\,\rho(e^{-i\omega})\,\rho(e^{i\omega})}$$
For an ARMA(p,q) process:
$$s_Y(\omega) = \frac{\sigma^2\,\theta(e^{-i\omega})\,\theta(e^{i\omega})}{2\pi\,\rho(e^{-i\omega})\,\rho(e^{i\omega})}$$
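As a numerical check of the AR case, the following sketch (assuming NumPy) compares the closed-form AR(1) spectrum $\sigma^2/[2\pi\,\rho(e^{-i\omega})\rho(e^{i\omega})]$ with the truncated cosine-series form of $s_Y(\omega)$, using the known AR(1) autocovariances $\gamma_j = \sigma^2\rho^j/(1-\rho^2)$:

```python
# A minimal sketch (assuming NumPy): for an AR(1) process Y_t = rho*Y_{t-1} + e_t,
# compare the closed-form spectrum sigma^2 / (2*pi*|1 - rho*exp(-i*w)|^2) with the
# truncated sum (1/(2*pi)) * [g0 + 2*sum_j g_j*cos(w*j)], g_j = sigma^2*rho^j/(1-rho^2).
import numpy as np

rho, sigma2, w = 0.7, 1.0, 0.8
closed_form = sigma2 / (2 * np.pi * np.abs(1 - rho * np.exp(-1j * w))**2)

j = np.arange(1, 200)                       # truncate the infinite sum
g0 = sigma2 / (1 - rho**2)
gj = sigma2 * rho**j / (1 - rho**2)
series_sum = (g0 + 2 * np.sum(gj * np.cos(w * j))) / (2 * np.pi)

print(closed_form, series_sum)              # agree to high precision
```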
Consider the following facts: $e^{-i\omega j} = \cos(\omega j) - i\sin(\omega j)$, $\cos(-\omega j) = \cos(\omega j)$, and $\sin(-\omega j) = -\sin(\omega j)$. Together with $\gamma_j = \gamma_{-j}$, the spectral density function can be simplified as:
$$s_Y(\omega) = \frac{1}{2\pi} \left[ \gamma_0 + 2\sum_{j=1}^{\infty} \gamma_j \cos(\omega j) \right], \quad \text{for } \omega \in [0, \pi]$$
This is a strictly real-valued, continuous function of $\omega$. We have $s_Y(\omega) = s_Y(-\omega)$ and $s_Y(\omega) = s_Y(\omega + 2\pi M)$ for any integer $M$. That is, $s_Y(\omega)$ is fully defined on $\omega \in [0, \pi]$.
There is also an inverse correspondence between the spectrum and the autocovariances:
$$\gamma_j = \int_{-\pi}^{\pi} s_Y(\omega)\,e^{i\omega j}\, d\omega = \int_{-\pi}^{\pi} s_Y(\omega)\cos(\omega j)\, d\omega$$
In particular, $\gamma_0 = \int_{-\pi}^{\pi} s_Y(\omega)\, d\omega = 2\int_0^{\pi} s_Y(\omega)\, d\omega$.
Therefore, spectral analysis can be used to decompose the variance of a time series, which can be viewed as the sum of the spectral densities over all possible frequencies. For example, consider integrating over only some of the frequencies:
$$\tau(\omega_k) = \frac{2}{\gamma_0} \int_0^{\omega_k} s_Y(\omega)\, d\omega, \quad \text{where } 0 < \omega_k \le \pi.$$
Thus, $0 < \tau(\omega_k) \le 1$ is interpreted as the proportion of the total variance of the time series that is associated with frequencies less than or equal to $\omega_k$.
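A minimal sketch (assuming NumPy and SciPy) of this variance decomposition for an AR(1) process; $\tau(\pi)$ should equal 1, and for positive $\rho$ most of the variance is concentrated at low frequencies:

```python
# A minimal sketch (assuming NumPy/SciPy): the share of an AR(1) variance due to
# frequencies below w_k, tau(w_k) = (2/g0) * integral_0^{w_k} s_Y(w) dw.
import numpy as np
from scipy.integrate import quad

rho, sigma2 = 0.7, 1.0
g0 = sigma2 / (1 - rho**2)
s = lambda w: sigma2 / (2 * np.pi * (1 - 2 * rho * np.cos(w) + rho**2))

tau = lambda wk: (2 / g0) * quad(s, 0, wk)[0]
print(tau(np.pi / 4))   # most AR(1) variance sits at low frequencies
print(tau(np.pi))       # integrates to 1 over [0, pi]
```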
$$Y_t = \mu + \int_0^{\pi} [\alpha(\omega)\cos(\omega t) + \delta(\omega)\sin(\omega t)]\, d\omega$$
where $\alpha(\omega)$ and $\delta(\omega)$ are random variables, for any fixed frequency $\omega$ in $[0, \pi]$, with the following properties: $\alpha(\omega)$ and $\delta(\omega)$ have mean zero, and for distinct frequencies $\omega_1 \neq \omega_2$ the random variables $\alpha(\omega_1)$, $\alpha(\omega_2)$, $\delta(\omega_1)$, and $\delta(\omega_2)$ are mutually uncorrelated.
Given an observed sample of $N$ observations $\{Y_1, Y_2, \ldots, Y_N\}$, and using the same notation $\mu$ for the sample mean
$$\mu = \frac{1}{N}\sum_{t=1}^{N} Y_t,$$
we can calculate the sample autocovariances $\gamma_j$ ($j = 0, 1, 2, \ldots, N-1$) as follows:
$$\gamma_j = \frac{1}{N}\sum_{t=j+1}^{N} (Y_t - \mu)(Y_{t-j} - \mu)$$
We set $\gamma_j = \gamma_{-j}$. The sample periodogram is defined by
$$s_Y(\omega) = \frac{1}{2\pi} \sum_{j=-N+1}^{N-1} \gamma_j e^{-i\omega j} = \frac{1}{2\pi} \left[ \gamma_0 + 2\sum_{j=1}^{N-1} \gamma_j \cos(\omega j) \right]$$
The area under the periodogram is the sample variance of $Y_t$:
$$\gamma_0 = \frac{1}{N}\sum_{t=1}^{N} (Y_t - \mu)^2 = \int_{-\pi}^{\pi} s_Y(\omega)\, d\omega = 2\int_0^{\pi} s_Y(\omega)\, d\omega$$
For a time series $Y_t$, we observe $N$ periods: $\{Y_1, Y_2, \ldots, Y_N\}$. Since a wave cycle is completed in $2\pi$ radians, each period corresponds to $2\pi/N$ radians (frequency). We let
$$\omega_1 = 2\pi/N, \quad \omega_2 = 4\pi/N, \quad \ldots, \quad \omega_M = 2M\pi/N$$
The highest frequency is obtained at $M = (N-1)/2$; that is, $(N-1)\pi/N < \pi$.
Sample Spectral Representation Theorem
Given any $N$ observations on a time series process $\{Y_1, Y_2, \ldots, Y_N\}$, there exist frequencies $\{\omega_1, \omega_2, \ldots, \omega_M\}$ and coefficients $\mu$, $\{\alpha_1, \alpha_2, \ldots, \alpha_M\}$, $\{\delta_1, \delta_2, \ldots, \delta_M\}$ such that
$$Y_t = \mu + \sum_{k=1}^{M} \left\{\alpha_k \cos[\omega_k(t-1)] + \delta_k \sin[\omega_k(t-1)]\right\}$$
where $\alpha_k \cos[\omega_k(t-1)]$ is orthogonal to $\alpha_j \cos[\omega_j(t-1)]$ for $k \neq j$; $\delta_k \sin[\omega_k(t-1)]$ is orthogonal to $\delta_j \sin[\omega_j(t-1)]$ for $k \neq j$; and $\alpha_k \cos[\omega_k(t-1)]$ is orthogonal to $\delta_j \sin[\omega_j(t-1)]$ for all $k$ and $j$. Furthermore,
$$\mu = \frac{1}{N}\sum_{t=1}^{N} Y_t$$
$$\alpha_k = \frac{2}{N}\sum_{t=1}^{N} Y_t \cos[\omega_k(t-1)], \quad k = 1, 2, \ldots, M$$
$$\delta_k = \frac{2}{N}\sum_{t=1}^{N} Y_t \sin[\omega_k(t-1)], \quad k = 1, 2, \ldots, M$$
The sample variance of $Y_t$ can be expressed as
$$\gamma_0 = \frac{1}{N}\sum_{t=1}^{N} (Y_t - \mu)^2 = \frac{1}{2}\sum_{k=1}^{M} (\alpha_k^2 + \delta_k^2)$$
The portion of the sample variance of $Y_t$ that can be attributed to cycles of frequency $\omega_k$ is given by
$$\frac{1}{2}(\alpha_k^2 + \delta_k^2) = \frac{4\pi}{N}\, s_Y(\omega_k)$$
where $s_Y(\omega_k)$ is the sample periodogram at frequency $\omega_k$.
Equivalently,
$$s_Y(\omega_k) = \frac{N}{8\pi}(\alpha_k^2 + \delta_k^2) = \frac{1}{2\pi N}\left\{ \left[\sum_{t=1}^{N} Y_t \cos[\omega_k(t-1)]\right]^2 + \left[\sum_{t=1}^{N} Y_t \sin[\omega_k(t-1)]\right]^2 \right\}$$
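The sample-variance identity $\gamma_0 = \frac{1}{2}\sum_k(\alpha_k^2 + \delta_k^2)$ can be verified directly. A minimal sketch (assuming NumPy; the simulated series is an arbitrary illustration) that computes $\alpha_k$ and $\delta_k$ at the frequencies $\omega_k = 2\pi k/N$ for odd $N$:

```python
# A minimal sketch (assuming NumPy): compute alpha_k, delta_k at the Fourier
# frequencies w_k = 2*pi*k/N and verify g0 = (1/2) * sum_k (alpha_k^2 + delta_k^2).
import numpy as np

rng = np.random.default_rng(0)
N = 201                                   # odd N so M = (N-1)/2 is an integer
Y = rng.standard_normal(N).cumsum() * 0.1 + rng.standard_normal(N)
t = np.arange(1, N + 1)

M = (N - 1) // 2
g0 = np.mean((Y - Y.mean())**2)           # sample variance

total = 0.0
for k in range(1, M + 1):
    wk = 2 * np.pi * k / N
    a_k = (2 / N) * np.sum(Y * np.cos(wk * (t - 1)))
    d_k = (2 / N) * np.sum(Y * np.sin(wk * (t - 1)))
    total += 0.5 * (a_k**2 + d_k**2)

print(g0, total)                          # equal, up to floating-point error
```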
Generalizing the univariate AR(1) model
$$Y_t = \delta + \rho Y_{t-1} + \varepsilon_t,$$
the multivariate system of $G$ variables can be written as follows:
$$Y_{it} = \delta_i + \sum_{j=1}^{G} \rho_{ij} Y_{j,t-1} + \varepsilon_{it}, \quad i = 1, 2, \ldots, G$$
This is called a Vector AutoRegression of order 1, or VAR(1). The matrix representation of the model as a simultaneous linear equations system looks like this:
$$\begin{bmatrix} Y_{1t} \\ Y_{2t} \\ \vdots \\ Y_{Gt} \end{bmatrix} = \begin{bmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_G \end{bmatrix} + \begin{bmatrix} \rho_{11} & \rho_{12} & \cdots & \rho_{1G} \\ \rho_{21} & \rho_{22} & \cdots & \rho_{2G} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{G1} & \rho_{G2} & \cdots & \rho_{GG} \end{bmatrix} \begin{bmatrix} Y_{1,t-1} \\ Y_{2,t-1} \\ \vdots \\ Y_{G,t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \\ \vdots \\ \varepsilon_{Gt} \end{bmatrix}$$
In shorthand notation,
$$\underset{(G \times 1)}{Y_t} = \underset{(G \times 1)}{\delta} + \underset{(G \times G)}{\rho}\,\underset{(G \times 1)}{Y_{t-1}} + \underset{(G \times 1)}{\varepsilon_t}$$
with the following assumptions: $E(\varepsilon_t) = 0$, $E(\varepsilon_t \varepsilon_t') = \Sigma$ (a $G \times G$ positive definite covariance matrix), and $E(\varepsilon_t \varepsilon_s') = 0$ for $t \neq s$.
Example: Representing AR(p) as VAR(1)
First, we can write the univariate AR(p) model as the system:
$$\begin{aligned} Y_t &= \delta + \rho_1 Y_{t-1} + \rho_2 Y_{t-2} + \cdots + \rho_p Y_{t-p} + \varepsilon_t \\ Y_{t-1} &= Y_{t-1} \\ Y_{t-2} &= Y_{t-2} \\ &\;\;\vdots \\ Y_{t-p+1} &= Y_{t-p+1} \end{aligned}$$
Or,
$$\begin{bmatrix} Y_t \\ Y_{t-1} \\ \vdots \\ Y_{t-p+1} \end{bmatrix} = \begin{bmatrix} \delta \\ 0 \\ \vdots \\ 0 \end{bmatrix} + \begin{bmatrix} \rho_1 & \rho_2 & \cdots & \rho_{p-1} & \rho_p \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} Y_{t-1} \\ Y_{t-2} \\ \vdots \\ Y_{t-p} \end{bmatrix} + \begin{bmatrix} \varepsilon_t \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
That is,
$$\underset{(p \times 1)}{Y_t} = \underset{(p \times 1)}{\delta} + \underset{(p \times p)}{\rho}\,\underset{(p \times 1)}{Y_{t-1}} + \underset{(p \times 1)}{\varepsilon_t}, \quad t = p+1, \ldots, N$$
This is a system of $p$ equations with a restricted $p \times p$ parameter matrix $\rho$. The first equation is stochastic, while the others are identities. The usable time series observations run from $p+1$ to $N$ ($N - p$ in total).
Consider the general VAR(p) model
$$Y_t = \delta + \rho_1 Y_{t-1} + \cdots + \rho_p Y_{t-p} + \varepsilon_t$$
where $Y_t, Y_{t-1}, \ldots, Y_{t-p}$ are $G \times 1$ vectors and the $\rho$'s are $G \times G$ coefficient matrices. We can represent this model by one big VAR(1) system, in the same way we represent AR(p) as a VAR(1):
$$\underset{(Gp \times 1)}{Y_t} = \underset{(Gp \times 1)}{\delta} + \underset{(Gp \times Gp)}{\rho}\,\underset{(Gp \times 1)}{Y_{t-1}} + \underset{(Gp \times 1)}{\varepsilon_t}, \quad t = p+1, \ldots, N$$
The size of the problem is $(N-p) \times Gp$, with the first set of $G$ equations being stochastic. The parameter matrix $\rho$ of the lagged variable $Y_{t-1}$ is
$$\rho = \begin{bmatrix} \rho_1 & \rho_2 & \cdots & \rho_{p-1} & \rho_p \\ I & 0 & \cdots & 0 & 0 \\ 0 & I & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & I & 0 \end{bmatrix}$$
where, for each $k = 1, 2, \ldots, p$, $\rho_k = [\rho_{ij,k}]$ ($i, j = 1, 2, \ldots, G$). Furthermore, $I$ is the $G \times G$ identity matrix and $0$ is the $G \times G$ zero matrix.
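A minimal sketch (assuming NumPy; `var_companion` is our own helper name) of stacking the $G \times G$ matrices $\rho_1, \ldots, \rho_p$ into this $Gp \times Gp$ companion matrix and checking stability through its eigenvalues:

```python
# A minimal sketch (assuming NumPy): stack VAR(p) coefficient matrices
# rho_1, ..., rho_p (each G x G) into the Gp x Gp companion matrix above.
import numpy as np

def var_companion(rhos):
    """rhos: list of p arrays, each G x G. Returns the Gp x Gp companion matrix."""
    p = len(rhos)
    G = rhos[0].shape[0]
    top = np.hstack(rhos)                          # [rho_1 rho_2 ... rho_p], G x Gp
    bottom = np.hstack((np.eye(G * (p - 1)), np.zeros((G * (p - 1), G))))
    return np.vstack((top, bottom))

rho1 = np.array([[0.5, 0.1], [0.0, 0.4]])
rho2 = np.array([[0.2, 0.0], [0.1, 0.1]])
big_rho = var_companion([rho1, rho2])
print(np.abs(np.linalg.eigvals(big_rho)))          # all < 1: the VAR(2) is stable
```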
It is of particular interest to examine the first set of $G$ (stochastic) equations:
$$\underset{(G \times 1)}{Y_t} = \underset{(G \times 1)}{\delta} + \underset{(G \times Gp)}{\rho}\,\underset{(Gp \times 1)}{Y_{t-1}} + \underset{(G \times 1)}{\varepsilon_t}, \quad t = p+1, \ldots, N$$
where $\rho = [\rho_1, \rho_2, \ldots, \rho_p]$. The model can easily be generalized to include $K$ exogenous variables $X_t$ as follows:
$$\underset{(G \times 1)}{Y_t} = \underset{(G \times 1)}{\delta} + \underset{(G \times Gp)}{\rho}\,\underset{(Gp \times 1)}{Y_{t-1}} + \underset{(G \times K)}{\beta}\,\underset{(K \times 1)}{X_t} + \underset{(G \times 1)}{\varepsilon_t}, \quad t = p+1, \ldots, N$$
with the following assumptions: $E(\varepsilon_t) = 0$, $E(\varepsilon_t \varepsilon_t') = \Sigma$, $E(\varepsilon_t \varepsilon_s') = 0$ for $t \neq s$, and $X_t$ exogenous (uncorrelated with $\varepsilon_t$).
Let $X$ and $Y$ be expressed in deviation form, and denote:

| Notation | Meaning |
|---|---|
| $X \rightarrow Y$ | $X$ Granger-causes $Y$ |
| $Y \rightarrow X$ | $Y$ Granger-causes $X$ |
| $X \leftrightarrow Y$ | $X$ Granger-causes $Y$ and $Y$ Granger-causes $X$ (feedback) |
| | Does $X$ Granger-cause $Y$? | Does $Y$ Granger-cause $X$? |
|---|---|---|
| Hypothesis | $H_0$: $X$ does not cause $Y$; $H_1$: $X$ causes $Y$ | $H_0$: $Y$ does not cause $X$; $H_1$: $Y$ causes $X$ |
| Unrestricted Model | $Y_t = \sum_{j=1}^{m} a_j Y_{t-j} + \sum_{k=1}^{n} b_k X_{t-k} + \varepsilon_t$ | $X_t = \sum_{j=1}^{m} a_j Y_{t-j} + \sum_{k=1}^{n} b_k X_{t-k} + \varepsilon_t$ |
| Restricted Model | $Y_t = \sum_{j=1}^{m} a_j Y_{t-j} + \varepsilon_t$ | $X_t = \sum_{k=1}^{n} b_k X_{t-k} + \varepsilon_t$ |
| Test Statistic | $F = \dfrac{(RSS_R - RSS_{UR})/n}{RSS_{UR}/(N-m-n)}$ | $F = \dfrac{(RSS_R - RSS_{UR})/m}{RSS_{UR}/(N-m-n)}$ |
| Granger Causality Test | If $F \ge F_c(n, N-m-n)$, reject $H_0$; that is, $X$ does cause $Y$. Otherwise, $X$ does not cause $Y$. | If $F \ge F_c(m, N-m-n)$, reject $H_0$; that is, $Y$ does cause $X$. Otherwise, $Y$ does not cause $X$. |

| Conclusion | Does $X$ Granger-cause $Y$? | Does $Y$ Granger-cause $X$? |
|---|---|---|
| $X \rightarrow Y$ | Reject $H_0$ | Do not reject $H_0$ |
| $Y \rightarrow X$ | Do not reject $H_0$ | Reject $H_0$ |
| $X \leftrightarrow Y$ | Reject $H_0$ | Reject $H_0$ |
The better approach is to estimate the system of two equations jointly:
$$Y_t = \sum_{j=1}^{m} a_{1j} Y_{t-j} + \sum_{k=1}^{n} b_{1k} X_{t-k} + \varepsilon_{1t}$$
$$X_t = \sum_{j=1}^{m} a_{2j} Y_{t-j} + \sum_{k=1}^{n} b_{2k} X_{t-k} + \varepsilon_{2t}$$
If $X$ does not Granger-cause $Y$, all the $b_{1k}$'s equal 0; if $Y$ does not Granger-cause $X$, all the $a_{2j}$'s equal 0.
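In practice the single-equation F-tests are available off the shelf. A minimal sketch using statsmodels' `grangercausalitytests` (the simulated data, where $X$ drives $Y$ with a one-period lag, is our own illustration):

```python
# A minimal sketch (assuming NumPy and statsmodels): an F-test of whether X
# Granger-causes Y, on simulated data where X does drive Y.
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
N = 500
X = np.zeros(N)
Y = np.zeros(N)
for t in range(1, N):
    X[t] = 0.5 * X[t - 1] + rng.standard_normal()
    Y[t] = 0.3 * Y[t - 1] + 0.4 * X[t - 1] + rng.standard_normal()

# Column order matters: the test asks whether the second column
# Granger-causes the first.
data = np.column_stack((Y, X))
results = grangercausalitytests(data, maxlag=2, verbose=False)
fstat, pval, _, _ = results[1][0]['ssr_ftest']   # lag-1 F-test
print(f"F = {fstat:.2f}, p = {pval:.4f}")        # small p-value: reject H0
```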
Write the stable VAR(1) system as
$$(I - \rho B)\,Y_t = \delta + \varepsilon_t$$
where $B$ is the backshift operator. Then,
$$Y_t = (I - \rho)^{-1}\delta + \sum_{i=0}^{\infty} \rho^i \varepsilon_{t-i} = Y^* + (\varepsilon_t + \rho^1 \varepsilon_{t-1} + \rho^2 \varepsilon_{t-2} + \cdots)$$
For the system of $G$ stochastic equations, the parameters of interest are $\rho^i[1{:}G, 1{:}G] = [\rho^i_{kj}]$, $k, j = 1, 2, \ldots, G$, for $i = 1, 2, \ldots$.
$Y^*$ is the equilibrium and $\varepsilon_t$ is the innovation. By shocking one element of $\varepsilon_t$, say $\varepsilon_{jt}$, $Y_t$ will move away from the equilibrium $Y^*$. Note that the effect of a change in $\varepsilon_{jt}$ falls not only on the $j$-th variable but also on the other variables in the system. The path by which the variables return to equilibrium traces the impulse responses of a stable VAR system. The impulse response function measures the effect of a one-time innovation $\varepsilon_{jt}$ on the $k$-th variable over time ($i = 0, 1, 2, \ldots$) through $\rho^i_{kj}$ ($k, j = 1, 2, \ldots, G$).
Let $\Sigma = E(\varepsilon_t \varepsilon_t') = PP'$ (a Cholesky decomposition, for example), and define $\eta_t = P^{-1}\varepsilon_t$, so that $E(\eta_t \eta_t') = I$. We can use $P^{-1}$ to orthogonalize the innovations:
$$Y_t = (I - \rho)^{-1}\delta + \sum_{i=0}^{\infty} \rho^*_i \eta_{t-i} = Y^* + (\rho^*_0 \eta_t + \rho^*_1 \eta_{t-1} + \rho^*_2 \eta_{t-2} + \cdots)$$
where $\rho^*_i = \rho^i P$ ($i = 0, 1, 2, \ldots$).
Choosing a $P$ is very similar to placing identification restrictions on a system of dynamic simultaneous equations. Given the chosen matrix $P$, the elements of $\eta_t$ are mutually orthogonal, so the $\rho^*_i$ have the usual causal interpretation: the element $\rho^*_{kj,i}$ of $\rho^*_i[1{:}G, 1{:}G]$ gives the effect of a one-time unit change in the $j$-th element of $\eta_t$ on the $k$-th element of $Y_t$ after $i$ periods, holding everything else constant.
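A minimal sketch (assuming NumPy; the coefficient and covariance matrices are illustrative) of computing orthogonalized impulse responses $\rho^*_i = \rho^i P$ for a stable VAR(1), with $P$ taken as the Cholesky factor of $\Sigma$:

```python
# A minimal sketch (assuming NumPy): orthogonalized impulse responses of a
# stable VAR(1). P is the Cholesky factor of Sigma = P P', and the response of
# variable k to a unit shock in orthogonalized error j after i periods is
# (rho^i P)[k, j].
import numpy as np

rho = np.array([[0.5, 0.1],
                [0.2, 0.4]])              # G x G coefficient matrix
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])            # innovation covariance E(e e')
P = np.linalg.cholesky(Sigma)             # lower triangular, Sigma = P P'

horizon = 8
irf = [np.linalg.matrix_power(rho, i) @ P for i in range(horizon + 1)]
# Response of variable 0 to a one-time shock in orthogonalized error 1:
print([round(m[0, 1], 4) for m in irf])   # dies out as i grows (stable VAR)
```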
Consider the reduced-form VAR(p) model
$$Y_t = \delta + \rho_1 Y_{t-1} + \cdots + \rho_p Y_{t-p} + \varepsilon_t$$
A short-run structural VAR model can be written as
$$A(Y_t - \delta - \rho_1 Y_{t-1} - \cdots - \rho_p Y_{t-p}) = B e_t$$
or, $Y_t - \delta - \rho_1 Y_{t-1} - \cdots - \rho_p Y_{t-p} = A^{-1} B e_t$,
where $A$ and $B$ are $G \times G$ nonsingular matrices of parameters to be estimated and $e_t$ is a $G \times 1$ vector of disturbances with $e_t \sim N(0, I)$ and $E(e_t e_s') = 0$ for $s \neq t$. Sufficient constraints must be placed on $A$ and $B$ so that $P = A^{-1}B$ is identified.
For example, to define a recursive system for $G = 3$, let
$$A = \begin{bmatrix} 1 & 0 & 0 \\ a_{21} & 1 & 0 \\ a_{31} & a_{32} & 1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} b_{11} & 0 & 0 \\ 0 & b_{22} & 0 \\ 0 & 0 & b_{33} \end{bmatrix}$$
(To Be Continued)
$$Y_t = \delta + \rho_1 Y_{t-1} + \cdots + \rho_p Y_{t-p} + \varepsilon_t$$
Then,
$$\Delta Y_t = \delta + \pi Y_{t-1} + \gamma_1 \Delta Y_{t-1} + \cdots + \gamma_{p-1} \Delta Y_{t-p+1} + \varepsilon_t$$
where $\pi = \sum_{j=1}^{p} \rho_j - I$ and $\gamma_i = -\sum_{j=i+1}^{p} \rho_j$, $i = 1, 2, \ldots, p-1$.
The $G \times G$ matrix $\pi$ has rank $r$ ($0 \le r \le G$). Assume that $\pi$ has reduced rank (that is, $0 < r < G$, so the variables cointegrate), so that it can be expressed as $\pi = \alpha\beta'$, where $\alpha$ and $\beta$ are both $G \times r$ matrices of rank $r$.
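The mapping from VAR levels coefficients to VEC parameters is mechanical. A minimal sketch (assuming NumPy; the $\rho_j$ values are illustrative) computing $\pi$ and the $\gamma_i$, and checking the rank of $\pi$:

```python
# A minimal sketch (assuming NumPy): map VAR(p) coefficient matrices to the
# VEC parameters pi = sum_j rho_j - I and gamma_i = -sum_{j>i} rho_j.
import numpy as np

rhos = [np.array([[0.6, 0.2], [0.1, 0.7]]),    # rho_1
        np.array([[0.3, -0.1], [0.0, 0.2]])]   # rho_2
G = rhos[0].shape[0]
p = len(rhos)

pi = sum(rhos) - np.eye(G)
gammas = [-sum(rhos[j] for j in range(i + 1, p)) for i in range(p - 1)]

print(pi)
print(np.linalg.matrix_rank(pi))   # reduced rank r < G signals cointegration
print(gammas[0])                   # gamma_1 = -rho_2 here
```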
Allowing for a constant and a linear trend and assuming that there are $r$ cointegrating relations, the VEC model is written as:
$$\Delta Y_t = \delta + \tau t + \alpha\beta' Y_{t-1} + \gamma_1 \Delta Y_{t-1} + \cdots + \gamma_{p-1} \Delta Y_{t-p+1} + \varepsilon_t$$
Because the VEC models the differences of the data, the constant $\delta$ implies a linear time trend in the levels, and the time trend $\tau t$ implies a quadratic trend in the levels of the data. By the reparameterization $\delta = \delta_1 + \alpha\delta_0$ and $\tau = \tau_1 + \alpha\tau_0$, the model can be rewritten as:
$$\Delta Y_t = \delta_1 + \tau_1 t + \alpha(\beta' Y_{t-1} + \delta_0 + \tau_0 t) + \gamma_1 \Delta Y_{t-1} + \cdots + \gamma_{p-1} \Delta Y_{t-p+1} + \varepsilon_t$$
Note that $\alpha$ is a $G \times r$ matrix of rank $r$, $\delta_0$ and $\tau_0$ are $r \times 1$ vectors of parameters, and $\delta_1$ and $\tau_1$ are $G \times 1$ vectors of parameters. Also, $\delta_1'\alpha\delta_0 = 0$ and $\tau_1'\alpha\tau_0 = 0$.
Placing restrictions on the trend terms in the above VEC model yields five cases:

1. Unrestricted trend ($\delta$, $\tau$ unrestricted): quadratic trends in the levels of the data.
2. Restricted trend ($\tau_1 = 0$): linear trends in the levels, with a trend in the cointegrating relations.
3. Unrestricted constant ($\tau = 0$): linear trends in the levels.
4. Restricted constant ($\tau = 0$, $\delta_1 = 0$): no trends in the levels, with a constant in the cointegrating relations.
5. No trend or constant ($\tau = 0$, $\delta = 0$).

Therefore, for model estimation, we have to identify the type of model (1-5), the number of lags ($p$), and the cointegrating rank ($r$).
The parameters of interest are: the cointegrating vectors $\beta$, the adjustment coefficients $\alpha$, and the short-run coefficient matrices $\gamma_1, \ldots, \gamma_{p-1}$.
(To Be Continued)