Time Series Analysis II

Univariate ARIMA Models

Table of Contents

The Theoretical Model

The Empirical Model

Extensions

Readings


The General Model

Yt = μ + ψ1εt-1 + ψ2εt-2 + ... + εt

or

Yt = μ + (1+ψ1B+ψ2B2+...)εt = μ + ψ(B)εt

where

ψ(B) = 1 + ψ1B + ψ2B2 + ..., and
εt ~ ii(0,σ2), t = 1,2,...,N.

Stationarity requirement for the process:
Mean μ = E(Yt) < ∞
Variance γ0 = σ2i=0,...,∞ ψi2 < ∞
Autocovariance γj = σ2i=0,...,∞ ψiψj+i < ∞
Autocorrelation φj = γj / γ0

The plot of the autocorrelation coefficients at each lag j = 0,1,2,... is called the autocorrelation function. In general, the autocorrelation function of a stationary time series falls to 0 quickly, while the autocorrelation function of a non-stationary time series does not converge to zero. However, differencing the time series may produce a stationary series.

A linear stochastic process is called an integrated process of order d, I(d), if the d-th difference of the series is stationary. Note that d is the lowest number of differences required for the resulting series to be stationary. That is, Yt ~ I(d) if ΔdYt is stationary, where

ΔYt = Yt-Yt-1
Δ2Yt = ΔYt-ΔYt-1
...
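
For example, a random walk is I(1): the level is nonstationary while the first difference behaves like white noise. Below is a minimal sketch in Python using numpy; the simulated data and variable names are ours, for illustration only.

```python
import numpy as np

# A random walk is I(1): the level is nonstationary, the first difference is white noise.
rng = np.random.default_rng(0)
eps = rng.normal(size=500)
y = np.cumsum(eps)             # Y_t = Y_{t-1} + eps_t

dy = np.diff(y, n=1)           # Delta Y_t = Y_t - Y_{t-1}  (stationary)
d2y = np.diff(y, n=2)          # Delta^2 Y_t                (over-differenced, still stationary)
print(np.var(y), np.var(dy), np.var(d2y))
```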

Moving Average Process of Order q: MA(q)

Consider the special case of the linear stochastic process with a finite number of lags:

ψj = -θj, j = 1,2,...,q
ψj = 0 for j > q

then

Yt = μ - θ1εt-1 - θ2εt-2 - ... - θqεt-q + εt
= μ + (1 - θ1B - θ2B2 - ... - θqBqt
= μ + θ(B) εt

where

θ(B) = 1 - θ1B - θ2B2 - ... - θqBq, and
εt ~ ii(0,σ2), t = 1,2,...,N.

Examples
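
For instance, an MA(2) process can be simulated directly from its definition. This is a minimal sketch in Python; μ = 0 and θ1 = 0.6, θ2 = -0.3 are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, mu = 500, 0.0
theta1, theta2 = 0.6, -0.3               # arbitrary MA(2) coefficients
eps = rng.normal(size=N + 2)             # eps_t ~ iid(0, 1)

# Y_t = mu - theta1*eps_{t-1} - theta2*eps_{t-2} + eps_t
y = mu + eps[2:] - theta1 * eps[1:-1] - theta2 * eps[:-2]
```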

Autoregressive Process of Order p: AR(p)

The autoregressive representation of the general linear stochastic process can be obtained by substituting out εt-1, εt-2, ... sequentially. Then,

Yt = δ + ρ1Yt-1 + ρ2Yt-2 + ρ3Yt-3 + ... + εt

Consider only the case with a finite number of autoregressive lags. That is,

Yt = δ + ρ1Yt-1 + ρ2Yt-2 + ... + ρpYt-p + εt

or

ρ(B)Yt = δ + εt,

where

ρ(B) = 1 - ρ1B - ρ2B2 - ... - ρpBp, and
εt ~ ii(0,σ2), t = 1,2,...,N.

Examples
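
For instance, an AR(2) process can be generated recursively from its definition. This is a minimal sketch in Python; δ = 0 and ρ1 = 0.5, ρ2 = 0.3 are arbitrary values inside the stationary region.

```python
import numpy as np

rng = np.random.default_rng(0)
N, delta = 500, 0.0
rho1, rho2 = 0.5, 0.3                    # arbitrary values in the stationary region
eps = rng.normal(size=N)

# Y_t = delta + rho1*Y_{t-1} + rho2*Y_{t-2} + eps_t
y = np.zeros(N)
for t in range(2, N):
    y[t] = delta + rho1 * y[t - 1] + rho2 * y[t - 2] + eps[t]
```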

Mixed Autoregressive and Moving Average Process: ARMA(p,q)

Yt = δ + ρ1Yt-1 + ρ2Yt-2 + ... + ρpYt-p - θ1εt-1 - θ2εt-2 - ... - θqεt-q + εt

or

ρ(B)Yt = δ + θ(B)εt,

where

ρ(B) = 1 - ρ1B - ρ2B2 - ... - ρpBp,
θ(B) = 1 - θ1B - θ2B2 - ... - θqBq, and
εt ~ ii(0,σ2), t = 1,2,...,N.

Examples
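
Similarly, an ARMA(1,1) series (with δ = 0) can be generated by filtering white noise with the two lag polynomials. This is a sketch assuming the scipy package; the values ρ1 = 0.7 and θ1 = 0.4 are arbitrary.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
eps = rng.normal(size=500)
rho1, theta1 = 0.7, 0.4                   # arbitrary ARMA(1,1) coefficients, delta = 0

# rho(B) Y_t = theta(B) eps_t : filter the noise with b = MA polynomial, a = AR polynomial
y = lfilter([1.0, -theta1], [1.0, -rho1], eps)
```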

Partial Autocorrelation Function

Deriving from the computation of the autocorrelation function (the Yule-Walker equations), we obtain the partial autocorrelation coefficients for AR(p) processes of different orders p = 1,2,...:

φ11 = φ1 (this is ρ1 of AR(1))
φ22 = (φ2 - φ12) / (1 - φ12) (this is ρ2 of AR(2)),
and for additional lags j = 3,4,...:
φjj = [φj - ∑k=1,...,j-1 φj-1,kφj-k] / [1 - ∑k=1,...,j-1 φj-1,kφk]
where φjk = φj-1,k - φjjφj-1,j-k for k = 1,2,...,j-1.

The plot of the partial autocorrelations can reveal the correct order of an AR(p) process: for AR(p), φjj = 0 for j > p. For MA(q), φjj is nonzero for all j and exhibits a geometrically decaying pattern. For ARMA(p,q), the decay of the partial autocorrelations φjj starts after the p-th lag.
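
The recursion above is easy to code directly. This is a sketch in Python; the helper name pacf_from_acf is ours, and acf is assumed to contain the autocorrelations φ0 = 1, φ1, φ2, ... of the series.

```python
import numpy as np

def pacf_from_acf(acf, max_lag):
    """Partial autocorrelations phi_jj, j = 1..max_lag, from acf = [1, phi_1, phi_2, ...]."""
    phi = np.zeros((max_lag + 1, max_lag + 1))      # phi[j, k] stores phi_{j,k}
    pacf = np.zeros(max_lag + 1)
    pacf[1] = phi[1, 1] = acf[1]                    # phi_11 = phi_1
    for j in range(2, max_lag + 1):
        num = acf[j] - np.dot(phi[j - 1, 1:j], acf[j - 1:0:-1])
        den = 1.0 - np.dot(phi[j - 1, 1:j], acf[1:j])
        phi[j, j] = pacf[j] = num / den
        # phi_{j,k} = phi_{j-1,k} - phi_jj * phi_{j-1,j-k}, k = 1,...,j-1
        phi[j, 1:j] = phi[j - 1, 1:j] - phi[j, j] * phi[j - 1, j - 1:0:-1]
    return pacf[1:]

# Check with the theoretical ACF of an AR(1) with rho_1 = 0.6: PACF should be (0.6, 0, 0)
print(pacf_from_acf(np.array([1.0, 0.6, 0.36, 0.216]), 3))
```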

The Empirical Model

Given a sample of time series observations Y1, Y2, ..., YN, sample statistics such as mean, variance, autocovariances, and autocorrelations can be used to identify the structure of the data generating process for Yt.

The structural identification of a time series includes (1) the minimum order d of differencing required on the sample to achieve stationarity; (2) the appropriate order q of a moving-average process; and (3) the appropriate order p of an autoregressive process for the stationary time series. First of all, a rapidly declining pattern in the sample autocorrelation plot (correlogram) is needed to ensure a stationary time series for further identification and analysis.

Model Identification

From the theoretical investigation of the linear stochastic process, the autocorrelation function is useful for identifying the order of a moving-average process (that is, MA(q)), while the partial autocorrelation function is useful for identifying the order of an autoregressive process (that is, AR(p)).
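
In practice, this inspection is usually done from the sample correlogram and partial correlogram. This is a sketch assuming the statsmodels and matplotlib packages; the AR(1) data are simulated only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# An AR(1) series simulated only for illustration
rng = np.random.default_rng(0)
eps = rng.normal(size=500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.6 * y[t - 1] + eps[t]

fig, axes = plt.subplots(2, 1)
plot_acf(y, lags=20, ax=axes[0])    # MA(q): cuts off after lag q; AR: decays geometrically
plot_pacf(y, lags=20, ax=axes[1])   # AR(p): cuts off after lag p (here p = 1)
plt.show()
```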

Model Estimation

Forecasting

Time series forecasting is based on the estimated model:

ρ(B)Yt = δ + θ(B)εt, t=1,2,...N

where

ρ(B) = 1-ρ1B-ρ2B2-...-ρpBp,
θ(B) = 1-θ1B-θ2B2-...-θqBq.

Since μ = ρ(B)-1δ and ψ(B) = ρ(B)-1θ(B) = 1 + ψ1B + ψ2B2 + ..., the forecasting model can be represented as:

Yt = μ + ψ1εt-1 + ψ2εt-2 + ... + εt
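
Given estimated coefficients, the ψ weights can be obtained by expanding ψ(B) = ρ(B)-1θ(B) recursively from ρ(B)ψ(B) = θ(B). This is a sketch in Python; the helper name psi_weights and the ARMA(1,1) values ρ1 = 0.7, θ1 = 0.4 are ours.

```python
import numpy as np

def psi_weights(rho, theta, n):
    """Coefficients of psi(B) = rho(B)^{-1} theta(B) = 1 + psi_1 B + psi_2 B^2 + ...

    rho   : [rho_1, ..., rho_p]     AR coefficients
    theta : [theta_1, ..., theta_q] MA coefficients
    """
    p, q = len(rho), len(theta)
    psi = np.zeros(n + 1)
    psi[0] = 1.0
    for j in range(1, n + 1):
        ar_part = sum(rho[i - 1] * psi[j - i] for i in range(1, min(j, p) + 1))
        ma_part = theta[j - 1] if j <= q else 0.0
        psi[j] = ar_part - ma_part        # from rho(B) psi(B) = theta(B)
    return psi

# Hypothetical ARMA(1,1) estimates rho_1 = 0.7, theta_1 = 0.4: psi_1 = 0.3, psi_2 = 0.21, ...
print(psi_weights([0.7], [0.4], 5))
```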


Extensions

Intervention Analysis

ρ(B)Yt = δ + γZt + θ(B)εt

where Zt is a deterministic fixed variable (e.g., trend, dummy, or step variable). The interpretation of the intervention variable Zt may be of interest.
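
One simple way to estimate a model of this form is to treat Zt as an exogenous regressor in an ARMA fit. The sketch below assumes the statsmodels package; the step date, effect size, and ARMA(1,1) order are arbitrary choices. Note that statsmodels parameterizes this as a regression with ARMA errors, which is close to, but not identical to, the equation above.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
N = 200
z = (np.arange(N) >= 120).astype(float)      # hypothetical step intervention at t = 120
y = 2.0 * z + rng.normal(size=N)             # toy series with an intervention effect

# ARMA(1,1) with Z_t as an exogenous variable; the coefficient on z estimates gamma
res = ARIMA(y, exog=z, order=(1, 0, 1)).fit()
print(res.params)
```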

Seasonal ARMA and Mixed Model

Fractional Difference of Time Series

For a nonstationary time series, a properly differenced (integrated) series is required for ARMA analysis. From the plots of the autocorrelation and partial autocorrelation functions, it may be evident that they decay hyperbolically instead of damping exponentially. To improve model performance, fractional differencing is useful for a time series that exhibits a long-memory process. Suppose Yt is obtained by fractionally differencing the time series Xt:

Yt = (1-B)δXt, -1 < δ < 1

where B is the backshift operator, and

(1-B)δ = ∑j=0,...,∞ [Γ(j+δ) / (Γ(j+1)Γ(δ))] (-B)j
We write Xt = (1-B)Yt. A value δ > 0 is indicative of a long-memory process, and δ < 0 of a short-memory process. When δ = 0, the process is memoryless. It can be shown that Xt is stationary if -1/2 < δ < 1/2. For δ ≥ 1/2, Xt is nonstationary!
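
One way to compute a fractional difference in practice is to truncate the expansion and use the binomial recursion for the weights of (1-B)d. This is a sketch in Python; the helper name frac_diff, the truncation argument k_max, and the example with d = 0.4 are ours.

```python
import numpy as np

def frac_diff(x, d, k_max=None):
    """Apply (1-B)^d to x with binomial weights w_0 = 1, w_j = w_{j-1}*(j-1-d)/j, truncated at k_max."""
    n = len(x)
    k_max = n if k_max is None else min(k_max, n)
    w = np.empty(k_max)
    w[0] = 1.0
    for j in range(1, k_max):
        w[j] = w[j - 1] * (j - 1 - d) / j
    y = np.empty(n)
    for t in range(n):
        m = min(t + 1, k_max)
        y[t] = np.dot(w[:m], x[t::-1][:m])   # y_t = sum_j w_j x_{t-j}, observed lags only
    return y

# Hypothetical use: fractionally difference a random walk with d = 0.4
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=300))
y = frac_diff(x, 0.4)
```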


Appendix: Frequency Domain Representation of Time Series

For many disaggregated microeconomic data, usually observed at a higher frequency, the frequency domain representation of the time series process (or spectral analysis) is useful. In this framework, we view an observed time series as a weighted sum of underlying series that have different cyclical patterns (e.g., seasonality, business cycles). In the time domain, we study autocovariances, that is, variation as a function of time. In the frequency domain, the variance of a time series is studied as a function of the frequency or wavelength of the variation.

Let Y = {Yt}t=-∞,...,∞ be a covariance stationary process with mean μ = E(Yt) and j-th autocovariance γj = E[(Yt-μ)(Yt-j-μ)], j=0,1,2,.... We assume γj = γ-j, and γj is absolutely summable, or ∑j=0,...,∞j| < ∞.

Autocovariance Generating Function

The autocovariance generating function for the time series process Y is

gY(z) = ∑j=-∞,...,∞ γjzj

where z denotes a complex scalar.

We note that a complex number can be represented in a two-dimensional (x,y)-space as

z = x + y i,     where i = √(-1)

or in the equivalent polar coordinates c (radius) and ω (angle):

c = (x2+y2)½
x = c cos(ω)
y = c sin(ω)
z = c [cos(ω) + i sin(ω)] = c e

Examples
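
For example, the two representations can be checked numerically. This is a small sketch in Python; the point z = 3 + 4i is arbitrary.

```python
import numpy as np

z = 3 + 4j                                  # x = 3, y = 4
c, w = abs(z), np.angle(z)                  # radius c = 5 and angle omega
print(c * (np.cos(w) + 1j * np.sin(w)))     # reproduces 3+4j
print(c * np.exp(1j * w))                   # reproduces 3+4j
```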

Spectral Density Function (Spectrum)

We now evaluate the autocovariance generating function gY(z) at the complex value z = e-iω and divide it by 2π:

sY(ω) = gY(e-iω)/(2π) = 1/(2π) ∑j=-∞,...,∞ γje-iωj

where ω is a real number.

sY(ω) is the spectrum or spectral density function for the time series process Y. In other words, for a time series process Y that has the set of autocovariances γj, the spectral density can be computed at any particular value of ω. The spectrum contains no new information beyond that in the autocovariances.

Consider the following facts:

  1. γj = γ-j (Symmetry of the autocovariances)
  2. exp(±iωj) = cos(ωj) ± i sin(ωj) (DeMoivre's theorem)
    Therefore, exp(iωj) + exp(-iωj) = 2 cos(ωj), which is always real.
  3. cos(0) = 1, cos(π) = -1, sin(0) = 0, sin(π) = 0
  4. cos(-ω) = cos(ω), sin(-ω) = -sin(ω)

The spectral density function can be simplified as:

sY(ω) = 1/(2π) [γ0+2∑j=1,...,∞ γjcos(ωj)],     for ω∈[0,π]

This is a strictly real-valued, continuous function of ω. We have sY(ω) = sY(-ω) and sY(ω) = sY(ω+2Mπ) for any integer M. That is, sY(ω) is completely determined by its values on ω∈[0,π].

Examples
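
For example, for the MA(1) process Yt = μ - θ1εt-1 + εt, the only nonzero autocovariances are γ0 = σ2(1+θ12) and γ1 = -θ1σ2, so the spectrum can be evaluated directly from the formula above. This is a sketch in Python; the helper name spectrum and the value θ1 = 0.5 are ours.

```python
import numpy as np

def spectrum(gamma, omega):
    """s_Y(w) = (1/2pi)[gamma_0 + 2 sum_j gamma_j cos(w j)] for gamma = [gamma_0, gamma_1, ...]."""
    j = np.arange(1, len(gamma))
    return (gamma[0] + 2.0 * np.sum(gamma[1:, None] * np.cos(np.outer(j, omega)), axis=0)) / (2 * np.pi)

# MA(1) with theta_1 = 0.5, sigma^2 = 1: gamma_0 = 1 + theta^2, gamma_1 = -theta, gamma_j = 0 for j > 1
theta, sigma2 = 0.5, 1.0
gamma = np.array([sigma2 * (1 + theta ** 2), -sigma2 * theta])
omega = np.linspace(0, np.pi, 5)
print(spectrum(gamma, omega))     # spectral density of the MA(1) at a few frequencies
```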

There is also a correspondence between the spectrum and the autocovariances:

γj = ∫ππ sY(ω)eiωjdω = ∫ππ sY(ω)cos(ωj)dω

In particular, γ0 = ∫ππ sY(ω)dω = 2 ∫0π sY(ω)dω.

Therefore, spectral analysis can be used to decompose the variance of a time series, which can be viewed as the sum of the spectral densities over all possible frequencies. For example, consider integration over only some of the frequencies:

τ(ωk) = (2/γ0) ∫0ωk sY(ω)dω, where 0 < ωk ≤ π.

Thus, 0 < τ(ωk) ≤ 1 is interpreted as the proportion of the total variance of the time series that is associated with frequencies less than or equal to ωk.

Spectral Representation Theorem

Any covariance stationary time series process can be expressed in the form:

Yt = μ + ∫0π [α(ω) cos(ωt) + δ(ω) sin(ωt)] dω

where α(ω) and δ(ω) are random variables, for any fixed frequency ω in [0, π], with the following properties:

Sample Periodogram

For any given ω, we can construct the sample analog of population spectrum, which is known as the sample periodogram.

Given an observed sample of N observations {Y1,Y2,...,YN}, using the same notation μ for the sample mean:

μ = ∑t=1,...,NYt/N

we can calculate the sample autocovariances γj (j = 0,1,2,...,N-1) as follows:

γj = ∑t=j+1,...,N (Yt-μ)(Yt-j-μ)/N

We set γj = γ-j. The sample periodogram is defined by

sY(ω) = 1/(2π) ∑j=-N+1,...,N-1 γje-iωj
= 1/(2π) [γ0+2∑j=1,...,N-1 γjcos(ωj)]

The area under the periodogram is the sample variance of Yt:
γ0 = ∑t=1,...,N (Yt-μ)2/N
= ∫ππ sY(ω)dω
= 2 ∫0π sY(ω)dω
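
The same recipe can be coded directly from the sample autocovariances. This is a sketch in Python; the helper name sample_periodogram is ours, and the white-noise data are simulated only to check that twice the area over [0,π] reproduces the sample variance.

```python
import numpy as np

def sample_periodogram(y, omega):
    """s_Y(w) = (1/2pi)[g_0 + 2 sum_{j=1}^{N-1} g_j cos(w j)] with sample autocovariances g_j."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    dev = y - y.mean()
    g = np.array([np.dot(dev[j:], dev[:n - j]) / n for j in range(n)])   # g_0, ..., g_{N-1}
    j = np.arange(1, n)
    return (g[0] + 2.0 * np.sum(g[1:, None] * np.cos(np.outer(j, omega)), axis=0)) / (2 * np.pi)

# Check: twice the area over [0, pi] recovers the sample variance g_0
rng = np.random.default_rng(0)
y = rng.normal(size=200)
omega = np.linspace(0, np.pi, 2001)
s = sample_periodogram(y, omega)
area = 2 * np.sum((s[:-1] + s[1:]) / 2) * (omega[1] - omega[0])   # trapezoid rule
print(area, np.var(y))                                            # approximately equal
```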

For a time series Yt, we observe N periods, that is, {Y1,Y2,...,YN}. Since a wave cycle is completed in 2π radians, each period corresponds to 2π/N radians (frequency). We let

ω1 = 2π/N
ω2 = 4π/N
...
ωM = 2Mπ/N

The highest frequency is obtained at M = (N-1)/2. That is, (N-1)π/N < π.

Sample Spectral Representation Theorem

Given any N observations on a time series process {Y1,Y2,...,YN}, there exist frequencies {ω12,...,ωM} and coefficients μ, {α12,...,αM}, {δ12,...,δM} such that

Yt = μ + ∑k=1,...,Mkcos[ωk(t-1)] + δksin[ωk(t-1)]}

where αkcos[ωk(t-1)] is orthogonal to αjcos[ωj(t-1)] for k≠j, δksin[ωk(t-1)] is orthogonal to δjsin[ωj(t-1)] for k≠j, and αkcos[ωk(t-1)] is orthogonal to δjsin[ωj(t-1)] for all k and j. Furthermore,

μ = ∑t=1,...,NYt/N
αk = (2/N) ∑t=1,...,N Yt cos[ωk(t-1)], k=1,2,...,M
δk = (2/N) ∑t=1,...,N Yt sin[ωk(t-1)], k=1,2,...,M

The sample variance of Yt can be expressed as

γ0 = ∑t=1,...,N (Yt-μ)2/N = (1/2) ∑k=1,...,Mk2 + δk2)

The portion of the sample variance of Yt that can be attributed to cycles of frequency ωk is given by:

(1/2) (αk2 + δk2) = (4π/N) sYk)

where sYk) is the sample periodogram at frequency ωk.

Equivalently,

sYk) = [N/(8π)] (αk2 + δk2)
= [1/(2πN)] { {∑t=1,...,N Yt cos[ωk(t-1)]}2 + {∑t=1,...,N Yt sin[ωk(t-1)]}2 }
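
These formulas can be verified numerically. This is a sketch in Python; the white-noise data are arbitrary, and N = 201 is taken odd so that M = (N-1)/2.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 201                                     # odd, so that M = (N-1)/2
y = rng.normal(size=N)
t = np.arange(1, N + 1)
M = (N - 1) // 2

mu = y.mean()
alpha = np.zeros(M + 1)
delta = np.zeros(M + 1)
for k in range(1, M + 1):
    wk = 2 * np.pi * k / N                  # omega_k = 2*pi*k/N
    alpha[k] = (2.0 / N) * np.sum(y * np.cos(wk * (t - 1)))
    delta[k] = (2.0 / N) * np.sum(y * np.sin(wk * (t - 1)))

# gamma_0 = (1/2) sum_k (alpha_k^2 + delta_k^2)
var_decomp = 0.5 * np.sum(alpha[1:] ** 2 + delta[1:] ** 2)
print(var_decomp, np.mean((y - mu) ** 2))   # the two agree up to rounding
```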


Copyright © Kuan-Pin Lin
Last updated: 03/06/2013