or
Yt = μ + (1+ψ1B+ψ2B2+...)εt = μ + ψ(B)εt
where
ψ(B) = 1 + ψ1B + ψ2B2 + ..., and
εt ~ ii(0,σ2),
t = 1,2,...,N.
Stationarity requirement for the process:
Mean: μ = E(Yt) < ∞
Variance: γ0 = σ2 ∑i=0,...,∞ ψi2 < ∞
Autocovariance: γj = σ2 ∑i=0,...,∞ ψiψj+i < ∞
Autocorrelation: φj = γj / γ0
The plot of autocorrelation coefficients at each lag j = 0,1,2,... is called the autocorrelation function. In general, the autocorrelation function of a stationary time series falls to zero quickly, while that of a non-stationary time series does not converge to zero. However, a non-stationary series may become stationary after differencing.
A linear stochastic process is called an integrated process of order d, I(d), if the d-th difference of the series is stationary, where d is the lowest number of differences required for the resulting series to be stationary. That is, Yt ~ I(d) if ΔdYt is stationary, where
ΔYt = Yt-Yt-1
Δ2Yt = ΔYt-ΔYt-1
...
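To illustrate, here is a minimal sketch (assuming Python with numpy; the helper name sample_acf and the simulated data are ours, not from the text) comparing the slowly decaying correlogram of a random walk, an I(1) series, with the quickly dying correlogram of its first difference.

```python
import numpy as np

def sample_acf(y, max_lag=20):
    """Sample autocorrelations phi_j = gamma_j / gamma_0 for j = 0, ..., max_lag."""
    y = np.asarray(y, dtype=float)
    n, mu = len(y), y.mean()
    gamma0 = np.sum((y - mu) ** 2) / n
    return np.array([np.sum((y[j:] - mu) * (y[:n - j] - mu)) / n / gamma0
                     for j in range(max_lag + 1)])

# A random walk is I(1): its first difference is stationary white noise.
rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=500))                  # nonstationary level series
print(sample_acf(y, 10).round(2))                    # decays very slowly
print(sample_acf(np.diff(y, n=1), 10).round(2))      # near zero after lag 0
```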
Moving Average Process MA(q)
If the ψ weights are truncated at lag q, that is,
ψj = -θj, j = 1,2,...,q
ψj = 0, j > q
then
Yt = μ - θ1εt-1 - θ2εt-2 - ... - θqεt-q + εt
   = μ + (1 - θ1B - θ2B2 - ... - θqBq)εt
   = μ + θ(B)εt
where
θ(B) = 1 - θ1B - θ2B2 - ... - θqBq, and
εt ~ ii(0,σ2),
t = 1,2,...,N.
The variance and autocovariances of the MA(q) process are:
γ0 = (1 + θ12 + θ22 + ... + θq2)σ2
γj = (-θj + θ1θj+1 + θ2θj+2 + ... + θq-jθq)σ2, j = 1,2,...,q
γj = 0 otherwise
and the autocorrelations are:
φj = (-θj + θ1θj+1 + θ2θj+2 + ... + θq-jθq) / (1 + θ12 + θ22 + ... + θq2), j = 1,2,...,q
φj = 0 otherwise
The plot of autocorrelation coefficients of a stationary MA(q) process indicates a finite memory pattern up to the q-th lag; beyond that, the autocorrelation function is zero.
Examples
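As an illustrative example (a sketch assuming Python with numpy; the name ma_acf and the coefficients 0.6, -0.3 are our own choices, not from the text), the MA(q) autocovariance and autocorrelation formulas above can be evaluated directly, showing the cutoff beyond lag q:

```python
import numpy as np

def ma_acf(theta, max_lag=10):
    """Theoretical autocorrelations of Yt = mu + theta(B) eps_t with
    theta(B) = 1 - theta1*B - ... - thetaq*B^q (psi_0 = 1, psi_j = -theta_j)."""
    theta = np.asarray(theta, dtype=float)
    q = len(theta)
    psi = np.concatenate(([1.0], -theta))
    gamma0 = np.sum(psi ** 2)                       # gamma_0 in units of sigma^2
    acf = [1.0]
    for j in range(1, max_lag + 1):
        gj = np.sum(psi[: q + 1 - j] * psi[j:]) if j <= q else 0.0
        acf.append(gj / gamma0)
    return np.array(acf)

# MA(2) example: autocorrelations cut off to zero beyond lag q = 2.
print(ma_acf([0.6, -0.3], max_lag=6).round(3))
```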
Autoregressive Process AR(p)
Yt = δ + ρ1Yt-1 + ρ2Yt-2 + ρ3Yt-3 + ... + εt
Consider only the case with a finite number of autoregressive lags. That is,
Yt = δ + ρ1Yt-1 + ρ2Yt-2 + ... + ρpYt-p + εt
or
ρ(B)Yt = δ + εt,
where
ρ(B) = 1 - ρ1B - ρ2B2 - ... - ρpBp, and
εt ~ ii(0,σ2),
t = 1,2,...,N.
In deviations from the mean, yt = Yt - μ, the variance is
γ0 = E(yt2) = E[(ρ1yt-1 + ρ2yt-2 + ... + ρpyt-p + εt)2]
            = ρ1γ1 + ρ2γ2 + ... + ρpγp + σ2
Similarly, the autocorrelations of the AR(p) process satisfy the Yule-Walker equations:
φ1 = ρ1 + ρ2φ1 + ρ3φ2 + ... + ρpφp-1
φ2 = ρ1φ1 + ρ2 + ρ3φ1 + ... + ρpφp-2
...
φp = ρ1φp-1 + ρ2φp-2 + ρ3φp-3 + ... + ρp
...
φj = ρ1φj-1 + ρ2φj-2 + ρ3φj-3 + ... + ρpφj-p, for j > p
The plot of autocorrelation coefficients φj (j=1,2,...) of a stationary AR(p) process indicates an infinite but decaying memory pattern.
Examples
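Example (a numpy sketch under our own parameter choices; ar_acf is an illustrative name): solve the first p Yule-Walker equations above for φ1,...,φp and extend the recursion for j > p, showing the decaying but never-vanishing pattern.

```python
import numpy as np

def ar_acf(rho, max_lag=12):
    """Autocorrelations phi_1, phi_2, ... of a stationary AR(p) process
    from the Yule-Walker equations phi_j = sum_i rho_i * phi_|j-i|."""
    rho = np.asarray(rho, dtype=float)
    p = len(rho)
    M = np.eye(p)
    b = np.zeros(p)
    for j in range(1, p + 1):              # j-th Yule-Walker equation
        for i in range(1, p + 1):
            k = abs(j - i)
            if k == 0:
                b[j - 1] += rho[i - 1]     # rho_i * phi_0, with phi_0 = 1
            else:
                M[j - 1, k - 1] -= rho[i - 1]
    phi = list(np.linalg.solve(M, b))      # phi_1, ..., phi_p
    for j in range(p + 1, max_lag + 1):    # recursion for j > p
        phi.append(sum(rho[i - 1] * phi[j - i - 1] for i in range(1, p + 1)))
    return np.array(phi)

# AR(2) example: the autocorrelations decay but never cut off.
print(ar_acf([0.6, 0.3], max_lag=8).round(3))
```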
Mixed Process ARMA(p,q)
The mixed ARMA(p,q) process combines the autoregressive and moving average terms:
ρ(B)Yt = δ + θ(B)εt,
where
ρ(B) = 1 - ρ1B - ρ2B2 - ... - ρpBp,
θ(B) = 1 - θ1B - θ2B2 - ... - θqBq, and
εt ~ ii(0,σ2),
t = 1,2,...,N.
Therefore,
γ0 = E(yt2) = ∑i=1,...,p ρiγi + [1 - ∑i=1,...,q θi(ρi-θi)]σ2.
That is,
σ2 = (γ0 - ∑i=1,...,p ρiγi) / [1 - ∑i=1,...,q θi(ρi-θi)]
Similarly, for j = 1,2,...,q,
γj = ∑i=1,...,p ρiγ|i-j| - [θj + ∑i=1,...,q-j θi+j(ρi-θi)]σ2
and, for j > q,
γj = ∑i=1,...,p ρiγ|i-j|
Dividing by γ0, the autocorrelations for j = 1,2,...,q are
φj = ∑i=1,...,p ρiφ|i-j| - [θj + ∑i=1,...,q-j θi+j(ρi-θi)]σ2/γ0
The plot of autocorrelation coefficients φj (j=1,2,...) of a stationary mixed ARMA(p,q) process reflects the combination of the infinite-memory AR and the finite-memory MA processes; after q lags, only the AR pattern continues.
Examples
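Example (an illustrative numpy sketch; the parameter values are our own): compute the ψ-weights of an ARMA model from the standard relation ρ(B)ψ(B) = θ(B), and then the autocorrelations from γj = σ2 ∑i ψiψi+j, showing the MA-type adjustment at the first q lags followed by pure AR decay.

```python
import numpy as np

def arma_psi_weights(rho, theta, n=200):
    """psi-weights from rho(B) psi(B) = theta(B):
    psi_0 = 1, psi_j = -theta_j + sum_i rho_i * psi_{j-i} (theta_j = 0 for j > q)."""
    psi = np.zeros(n + 1)
    psi[0] = 1.0
    for j in range(1, n + 1):
        tj = theta[j - 1] if j <= len(theta) else 0.0
        psi[j] = -tj + sum(rho[i - 1] * psi[j - i]
                           for i in range(1, min(j, len(rho)) + 1))
    return psi

def arma_acf(rho, theta, max_lag=10):
    """Autocorrelations via gamma_j = sigma^2 * sum_i psi_i psi_{i+j} (truncated)."""
    psi = arma_psi_weights(rho, theta)
    gamma = [np.sum(psi[: len(psi) - j] * psi[j:]) for j in range(max_lag + 1)]
    return np.array(gamma) / gamma[0]

# ARMA(1,1): one MA-type adjustment at lag 1, then AR(1)-style geometric decay.
print(arma_acf([0.7], [0.4], max_lag=6).round(3))
```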
Partial Autocorrelation Function
The partial autocorrelation coefficients φjj are:
φ11 = φ1 (this is ρ1 of AR(1))
φ22 = (φ2 - φ12) / (1 - φ12) (this is ρ2 of AR(2)),
and for additional lags j = 3,4,..., φjj is the j-th (last) autoregressive coefficient obtained from the AR(j) model, as defined below.
The plot of partial autocorrelations can reveal the correct order of an AR(p) process. For an AR(p) process, φjj = 0 for j > p. For an MA(q) process, φjj is nonzero for all j and exhibits a geometrically decaying pattern. For an ARMA(p,q) process, the decay of the partial autocorrelations φjj starts after the p-th lag.
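As a sketch (numpy; pacf_from_acf is an illustrative name), φjj can be computed by solving the Yule-Walker system of an AR(j) model built from the autocorrelations, a population analogue of the AR(j) regression defined later in the text, and taking the last coefficient. For an AR(2) process the partial autocorrelations cut off after lag 2:

```python
import numpy as np

def pacf_from_acf(acf, max_lag=10):
    """Partial autocorrelations phi_jj: the last coefficient of the AR(j)
    Yule-Walker system built from autocorrelations acf = [phi_1, phi_2, ...]."""
    acf = np.asarray(acf, dtype=float)
    pacf = []
    for j in range(1, max_lag + 1):
        # Toeplitz matrix of autocorrelations phi_|i-k| with phi_0 = 1 on the diagonal
        R = np.array([[1.0 if i == k else acf[abs(i - k) - 1] for k in range(j)]
                      for i in range(j)])
        coef = np.linalg.solve(R, acf[:j])      # AR(j) coefficients rho_1, ..., rho_j
        pacf.append(coef[-1])                   # phi_jj = rho_j
    return np.array(pacf)

# AR(2) with rho1 = 0.6, rho2 = 0.3: build its ACF from the Yule-Walker recursion,
# then check that the PACF cuts off after lag 2.
rho1, rho2 = 0.6, 0.3
acf = [rho1 / (1 - rho2)]                       # phi_1
acf.append(rho1 * acf[0] + rho2)                # phi_2
for j in range(2, 10):
    acf.append(rho1 * acf[j - 1] + rho2 * acf[j - 2])
print(pacf_from_acf(acf, max_lag=5).round(3))
```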
The structural identification of a time series includes (1) the minimum order d of differencing required on the sample to achieve stationarity; (2) the appropriate order q of a moving-average process; and (3) the appropriate order p of an autoregressive process for the stationary time series. First of all, a rapidly declining pattern of the sample autocorrelation plot, or correlogram, is needed to ensure a stationary time series for further identification and analysis.
Bartlett Test
Testing H0: φj = 0 for each j > q (no autocorrelation at each lag j longer than q) is based on the Bartlett distribution of the estimated φj.
That is, the estimated φj ~ normal(0, √Var(φj)) approximately, where
Var(φj) = (1/N)(1 + 2 ∑i=1,...,q φi2).
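A small sketch of the Bartlett standard error above (numpy; bartlett_se and the sample numbers are illustrative, not from the text): compare a sample autocorrelation at a lag j > q with its approximate two-standard-error bound.

```python
import numpy as np

def bartlett_se(acf, q, N):
    """Bartlett standard error for a sample autocorrelation at any lag j > q,
    under H0: phi_j = 0 for j > q.  acf[0] is the lag-1 autocorrelation."""
    var = (1.0 + 2.0 * np.sum(np.asarray(acf[:q], dtype=float) ** 2)) / N
    return np.sqrt(var)

# E.g. test the lag-4 sample autocorrelation of a series with N = 200 observations,
# assuming the process is at most MA(3): reject H0 if |phi_4| > 1.96 * se.
acf_hat = [0.45, 0.20, 0.10, 0.08]      # illustrative sample autocorrelations
se = bartlett_se(acf_hat, q=3, N=200)
print(round(float(se), 4), abs(acf_hat[3]) > 1.96 * se)
```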
Box-Pierce Test and Ljung-Box Test
Testing H0: φ1 = ... = φk = 0 (zero autocorrelation coefficients up to some k lags) based on the Box-Pierce Q or Ljung-Box Q' statistic defined as follows:
Q = N ∑j=1,...,kφj2
Q' = N(N+2) ∑j=1,...,kφj2/(N-j)
Both Q and Q' ~ Chi-Square(k).
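A sketch of both statistics (assuming Python with numpy and scipy.stats for the chi-square p-values; q_tests and the sample numbers are illustrative):

```python
import numpy as np
from scipy import stats

def q_tests(acf, N):
    """Box-Pierce Q and Ljung-Box Q' statistics for the first k sample
    autocorrelations acf = [phi_1, ..., phi_k], with Chi-Square(k) p-values."""
    acf = np.asarray(acf, dtype=float)
    k = len(acf)
    lags = np.arange(1, k + 1)
    Q = N * np.sum(acf ** 2)
    Qp = N * (N + 2) * np.sum(acf ** 2 / (N - lags))
    return (Q, stats.chi2.sf(Q, k)), (Qp, stats.chi2.sf(Qp, k))

# Illustrative: k = 4 sample autocorrelations from N = 200 observations.
print(q_tests([0.15, -0.08, 0.05, 0.02], N=200))
```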
Let φjj be the partial autocorrelation coefficient at the j-th lag. That is, φjj = ρj obtained from the autoregressive regression of the AR(j) model:
Yt = δ + ρ1Yt-1 + ρ2Yt-2 + ... + ρjYt-j + εt
If the sample series follows an AR(p) process, then the autoregressive coefficient at each lag j longer than p must be zero.
Testing H0: φjj = 0 for each j > p is based on the approximate distribution of the estimated φjj ~ normal(0, 1/√N). Alternatively, the standard error of φjj can be obtained from the corresponding estimated autoregressive regression equation for each lag.
In order to use all N data observations, initialization may be needed for the pre-sample values (Y0, ..., Y-p+1) and (ε0, ..., ε-q+1).
The model may be written in the "inverted" form as
θ(B)-1(-δ+ρ(B)Yt) = εt
where εt ~ ii(0,σ2). Conditional on the historical information (YN, ..., Y1) and the data initialization (Y0, ..., Y-p+1), (ε0, ..., ε-q+1), the sum-of-squares is defined by
S = ∑t=1,2,...,Nεt2
Assuming εt ~ nii(0,σ2) for each observation t, the concentrated log-likelihood function is
ll = -N/2 (1+ln(2π)-ln(N)+ln(S))
The conditional maximum likelihood estimators of the ρs, θs, and δ are obtained by maximizing the nonlinear function ll. A set of initial values for the parameters ρs and θs is needed to start the iteration of nonlinear model estimation.
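As an illustrative sketch of this conditional estimation (not the text's actual program; assuming numpy and scipy.optimize, with all names, parameter values, and starting values our own): for an ARMA(1,1) model, compute the residuals recursively with zero pre-sample values, form S, and maximize the concentrated log-likelihood above numerically.

```python
import numpy as np
from scipy.optimize import minimize

def concentrated_ll(params, y):
    """Concentrated log-likelihood of an ARMA(1,1) model rho(B)Yt = delta + theta(B)eps_t,
    conditional on zero pre-sample values Y_0 = eps_0 = 0."""
    delta, rho1, theta1 = params
    N = len(y)
    eps = np.zeros(N)
    for t in range(N):
        y_lag = y[t - 1] if t > 0 else 0.0
        e_lag = eps[t - 1] if t > 0 else 0.0
        # eps_t = Yt - delta - rho1*Y_{t-1} + theta1*eps_{t-1}
        eps[t] = y[t] - delta - rho1 * y_lag + theta1 * e_lag
    S = np.sum(eps ** 2)
    return -N / 2.0 * (1.0 + np.log(2 * np.pi) - np.log(N) + np.log(S))

# Simulate an ARMA(1,1) series and maximize ll (i.e. minimize -ll).
rng = np.random.default_rng(1)
e = rng.normal(size=301)
y = np.zeros(301)
for t in range(1, 301):
    y[t] = 1.0 + 0.6 * y[t - 1] + e[t] - 0.4 * e[t - 1]
res = minimize(lambda p: -concentrated_ll(p, y[1:]), x0=[0.0, 0.1, 0.1],
               method="Nelder-Mead")
print(res.x)      # estimates of (delta, rho1, theta1)
```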
Further identification should then be carried out on the estimated residuals, based on N-(p+q+1) effective observations since the estimated model is an ARMA(p,q) process.
Select the best model according to the Information Criteria:
where K = p+q+1 and N is the sample size used for model estimation.
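As a sketch using the standard Akaike (AIC) and Schwarz (BIC) forms based on the maximized log-likelihood ll, with K = p+q+1 parameters and sample size N (the function name and the numbers are illustrative, and these are the textbook definitions rather than necessarily the exact forms used here):

```python
import numpy as np

def info_criteria(ll, K, N):
    """Standard information criteria from the maximized log-likelihood ll,
    number of parameters K = p + q + 1, and sample size N."""
    aic = -2.0 * ll + 2.0 * K            # Akaike information criterion
    bic = -2.0 * ll + K * np.log(N)      # Schwarz (Bayesian) information criterion
    return aic, bic

# Smaller values are preferred; BIC penalizes extra ARMA terms more heavily
# than AIC whenever ln(N) > 2 (roughly N > 8).
print(info_criteria(ll=-420.5, K=3, N=300))
```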
ρ(B)Yt = δ + θ(B)εt, t=1,2,...N
where
ρ(B) = 1-ρ1B-ρ2B2-...-ρpBp,
θ(B) = 1-θ1B-θ2B2-...-θqBq.
Because μ = ρ(B)-1δ and ψ(B) = ρ(B)-1θ(B), the forecasting model can be represented as:
Yt = μ + ψ1εt-1 + ψ2εt-2 + ... + εt
Given the historical information available at the end of estimation period N,
HN = (Y-p+1,...,Y-1,Y0,Y1,...,YN; ε-q+1,...,ε-1,ε0,ε1,...,εN)
One-step ahead forecast is the conditional expectation of YN+1:
YN(1) = E(YN+1|HN) = μ + ψ1εN + ψ2εN-1 + ...
Compared with the observed YN+1 which is:
YN+1 = μ + εN+1 + ψ1εN + ψ2εN-1 + ...
One-step ahead forecast error is defined by:
εN(1) = YN+1-YN(1) = εN+1
E(εN(1)) = 0
σN2(1) = Var(εN(1)) = Var(εN+1) = σ2
Similarly, two-step ahead forecast is the conditional expectation of YN+2:
YN(2) = E(YN+2|HN) = μ + ψ2εN + ψ3εN-1 + ...
Compared with the observed YN+2 which is:
YN+2 = μ + εN+2 + ψ1εN+1 + ψ2εN + ...
Two-step ahead forecast error is defined by:
εN(2) = YN+2-YN(2) = εN+2 + ψ1εN+1 = (1 + ψ1B)εN+2
E(εN(2)) = 0
σN2(2) = Var(εN(2)) = Var((1+ψ1B)εN+2) = (1+ψ12)σ2 > σN2(1)
f-step ahead forecast is the conditional expectation of YN+f:
YN(f) = E(YN+f|HN) = μ + ψfεN + ψf+1εN-1 + ...
→ μ, as f → ∞
Compared with the observed YN+f which is:
YN+f = μ + εN+f + ψ1εN+f-1 + ψ2εN+f-2 + ...
f-step ahead forecast error is defined by:
εN(f) = YN+f-YN(f) = (1+ψ1B+...+ψf-1Bf-1)εN+f
E(εN(f)) = 0
σN2(f) = Var(εN(f)) = (1+ψ12+...+ψf-12)σ2 > ... > σN2(1)
In general, the (f+1)-step ahead forecast is written as
YN(f+1) = E(YN+f+1|HN) = μ + ψf+1εN + ψf+2εN-1 + ...
Compared with the f-step ahead forecast at N+1 (with the historical information HN+1 = (HN,YN+1,εN+1)):
YN+1(f) = E(YN+f+1|HN+1) = μ + ψfεN+1 + ψf+1εN + ...
Then the forecast revision for YN+f+1 is defined by
YN+1(f) - YN(f+1) = ψfεN+1 = ψfεN(1)
Therefore, with the additional information available at N+1, the f-step ahead forecast is just the (f+1)-step ahead forecast made at the previous date N, adjusted for the one-step forecast error εN(1) weighted by the error-learning coefficient ψf as follows:
YN+1(f) = YN(f+1) + ψfεN(1)
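A sketch of these formulas (numpy; the AR(1) ψ-weights ψj = ρ1j, the function names, and all numbers are illustrative): compute the f-step forecast from the ψ-weights and innovations, its error variance, and verify the error-learning update YN+1(f) = YN(f+1) + ψf εN(1).

```python
import numpy as np

def f_step_forecast(mu, psi, eps, f):
    """Y_N(f) = mu + psi_f*eps_N + psi_{f+1}*eps_{N-1} + ...;
    eps holds the in-sample innovations eps_1, ..., eps_N (eps[-1] = eps_N)."""
    weights = psi[f : f + len(eps)]                 # psi_f, psi_{f+1}, ...
    return mu + np.sum(weights * eps[::-1][: len(weights)])

def forecast_error_variance(psi, sigma2, f):
    """Var(eps_N(f)) = (1 + psi_1^2 + ... + psi_{f-1}^2) * sigma^2."""
    return sigma2 * (1.0 + np.sum(psi[1:f] ** 2))

# Illustrative AR(1) case: psi_j = rho1^j, so forecasts decay toward mu.
rho1, sigma2, mu = 0.7, 1.0, 5.0
psi = rho1 ** np.arange(200)
rng = np.random.default_rng(2)
eps = rng.normal(size=100)                          # in-sample innovations
eps_new = rng.normal()                              # one-step forecast error eps_N(1)
# Error learning: Y_{N+1}(2) equals Y_N(3) plus psi_2 * eps_N(1).
lhs = f_step_forecast(mu, psi, np.append(eps, eps_new), f=2)
rhs = f_step_forecast(mu, psi, eps, f=3) + psi[2] * eps_new
print(np.isclose(lhs, rhs), forecast_error_variance(psi, sigma2, f=2))
```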
An intervention model augments the ARMA model with a deterministic fixed variable Zt (e.g., a trend, dummy, or step variable). The interpretation of the intervention variable Zt may be of interest.
For seasonal data, denote by s the seasonal span (s = 4 for quarterly data, s = 12 for monthly data). Then the seasonal difference of a series Zt is
Yt = (1-Bs)Zt
a purely seasonal ARMA model is
ρs(Bs)Yt = δ + θs(Bs)εt
and a multiplicative seasonal ARMA model is
ρs(Bs)ρ(B)Yt = δ + θs(Bs)θ(B)εt
Yt = (1-B)δXt, -1 < δ < 1
where B is the backshift operator, and
(1-B)δ = ∑j=0,1,...,∞ C(δ,j) (-B)j
where C(δ,j) = δ(δ-1)...(δ-j+1)/j! is the generalized binomial coefficient.
We write Xt = (1-B)-δYt. A value δ > 0 is indicative of long memory, δ < 0 of a short-memory process, and δ = 0 of a memoryless process. It can be shown that Xt is stationary if -1/2 < δ < 1/2; for δ ≥ 1/2, Xt is nonstationary.
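A sketch of the fractional-difference filter (numpy; frac_diff_weights and frac_diff are illustrative names, and we write d for the δ in the text): the weights follow from the binomial expansion above via the recursion wj = wj-1(j-1-d)/j.

```python
import numpy as np

def frac_diff_weights(d, n_weights=20):
    """Coefficients w_j of (1-B)^d = sum_j w_j B^j, i.e. w_j = (-1)^j * C(d, j),
    via the recursion w_j = w_{j-1} * (j - 1 - d) / j."""
    w = np.empty(n_weights)
    w[0] = 1.0
    for j in range(1, n_weights):
        w[j] = w[j - 1] * (j - 1 - d) / j
    return w

def frac_diff(x, d):
    """Apply (1-B)^d to a series x, conditioning on zero pre-sample values."""
    x = np.asarray(x, dtype=float)
    w = frac_diff_weights(d, n_weights=len(x))
    return np.array([np.dot(w[: t + 1], x[t::-1]) for t in range(len(x))])

# d = 0.3 (long memory): the weights die out slowly; d = 1 reproduces ordinary
# first differencing with the single weight -1.
print(frac_diff_weights(0.3, 8).round(4))
print(frac_diff_weights(1.0, 8).round(4))
print(frac_diff(np.arange(5.0), 1.0))          # equals [0, 1, 1, 1, 1]
```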
Let Y = {Yt}t=-∞,...,∞ be a covariance stationary process with mean μ = E(Yt) and j-th autocovariance γj = E[(Yt-μ)(Yt-j-μ)], j=0,1,2,.... We assume γj = γ-j, and that γj is absolutely summable, i.e., ∑j=0,...,∞ |γj| < ∞.
The autocovariance-generating function of Y is defined by
gY(z) = ∑j=-∞,...,∞ γjzj
where z denotes a complex scalar.
We note that a complex number can be represented in a two-dimensional (x,y)-space such as
z = x + y i, where i = √(-1)
or in the equivalent polar coordinates c (radius) and ω (angle):
c = (x2+y2)½
x = c cos(ω)
y = c sin(ω)
z = c [cos(ω) + i sin(ω)] = c eiω
Examples
For an AR(p) process:
gY(z) = σ2 / [ρ(z)ρ(z-1)]
For an ARMA(p,q) process:
gY(z) = σ2 θ(z)θ(z-1) / [ρ(z)ρ(z-1)]
sY(ω) = gY(e-iω)/(2π) = 1/(2π) ∑j=-∞,...,∞ γje-iωj
where ω is a real number.
sY(ω) is the spectrum or spectral density function for the time series process Y. In other words, for a time series process Y that has the set of autocovariances γj, the spectral density can be computed at any particular value of ω. The spectrum contains no new information beyond that in the autocovariances.
Consider the following facts: e-iωj = cos(ωj) - i sin(ωj), cos(-x) = cos(x), sin(-x) = -sin(x), and γj = γ-j. The spectral density function can then be simplified as:
sY(ω) = 1/(2π) [γ0+2∑j=1,...,∞ γjcos(ωj)], for ω∈[0,π]
This is a strictly real-valued, continuous function of ω. We have sY(ω) = sY(-ω) and sY(ω) = sY(ω+M2π) for any integer M. That is, sY(ω) is fully defined for ω∈[0,π].
Examples
For an MA(q) process:
sY(ω) = σ2 θ(e-iω) θ(eiω) / (2π)
For an AR(p) process:
sY(ω) = σ2 / [2π ρ(e-iω) ρ(eiω)]
For an ARMA(p,q) process:
sY(ω) = σ2 θ(e-iω) θ(eiω) / [2π ρ(e-iω) ρ(eiω)]
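A sketch evaluating the ARMA spectral density above on a frequency grid (numpy; arma_spectrum and the parameter values are illustrative). For an AR(1) process with a positive coefficient, most of the variance is concentrated at low frequencies (long swings):

```python
import numpy as np

def arma_spectrum(rho, theta, sigma2=1.0, n_freq=5):
    """Theoretical spectral density of an ARMA(p,q) process on [0, pi]:
    s(w) = sigma^2 * |theta(e^{-iw})|^2 / (2*pi*|rho(e^{-iw})|^2)."""
    omega = np.linspace(0.0, np.pi, n_freq)
    z = np.exp(-1j * omega)
    rho_poly = 1.0 - sum(r * z ** (i + 1) for i, r in enumerate(rho))
    theta_poly = 1.0 - sum(t * z ** (i + 1) for i, t in enumerate(theta))
    return omega, sigma2 * np.abs(theta_poly) ** 2 / (2 * np.pi * np.abs(rho_poly) ** 2)

# AR(1) with rho1 = 0.7: the spectrum is largest at low frequencies.
print(arma_spectrum([0.7], [], n_freq=5))
```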
There is also a correspondence between the spectrum and the autocovariances:
γj = ∫-ππ sY(ω)eiωjdω = ∫-ππ sY(ω)cos(ωj)dω
In particular, γ0 = ∫-ππ sY(ω)dω = 2 ∫0π sY(ω)dω.
Therefore, spectral analysis can be used to decompose the variance of a time series, which can be viewed as the sum of the spectral densities over all possible frequencies. For example, consider integration over only some of the frequencies:
τ(ωk) = (2/γ0) ∫0ωk sY(ω)dω, where 0 < ωk ≤ π.
Thus, 0 < τ(ωk) ≤ 1 is interpreted as the proportion of the total variance of the time series that is associated with frequencies less than or equal to ωk.
Spectral Representation Theorem
Yt = μ + ∫0π [α(ω) cos(ωt) + δ(ω) sin(ωt)] dω
where α(ω) and δ(ω) are zero-mean random variables for any fixed frequency ω in [0, π], mutually uncorrelated and uncorrelated across distinct frequencies.
Given an observed sample of N observations {Y1,Y2,...,YN}, using the same notation μ for the sample mean:
μ = ∑t=1,...,NYt/N
we can calculate the sample autocovariances γj (j = 0,1,2,...,N-1) as follows:
γj = ∑t=j+1,...,N (Yt-μ)(Yt-j-μ)/N
We set γj = γ-j. The sample periodogram is defined by
sY(ω) = 1/(2π) ∑j=-N+1,...,N-1 γje-iωj
      = 1/(2π) [γ0 + 2∑j=1,...,N-1 γjcos(ωj)]
The area under the periodogram is the sample variance of Yt:
γ0 = ∑t=1,...,N (Yt-μ)2/N
   = ∫-ππ sY(ω)dω
   = 2 ∫0π sY(ω)dω
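A sketch checking this identity numerically (numpy; sample_periodogram is an illustrative name): build the periodogram from the sample autocovariances and confirm that twice its numerical integral over [0, π] is close to the sample variance γ0.

```python
import numpy as np

def sample_periodogram(y, omegas):
    """Sample periodogram s(w) = (1/2pi)[g_0 + 2 sum_j g_j cos(w j)], with the
    sample autocovariances g_j divided by N as in the text."""
    y = np.asarray(y, dtype=float)
    N, mu = len(y), y.mean()
    g = np.array([np.sum((y[j:] - mu) * (y[: N - j] - mu)) / N for j in range(N)])
    js = np.arange(1, N)
    cosines = np.cos(np.outer(js, np.asarray(omegas, dtype=float)))
    return (g[0] + 2.0 * (g[1:, None] * cosines).sum(axis=0)) / (2.0 * np.pi)

rng = np.random.default_rng(3)
y = rng.normal(size=200)
omegas = np.linspace(0.0, np.pi, 4001)
s = sample_periodogram(y, omegas)
area = 2.0 * np.pi * s.mean()                 # approximates 2 * integral over [0, pi]
print(round(area, 3), round(np.var(y), 3))    # the two should be close
```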
For a time series Yt, we observe N periods, that is, {Y1,Y2,...,YN}. Since a wave cycle is completed in 2π radians, each period corresponds to 2π/N radians (frequency). We let
ω1 = 2π/N
ω2 = 4π/N
...
ωM = 2Mπ/N
The highest frequency is obtained at M = (N-1)/2. That is, (N-1)π/N < π.
Sample Spectral Representation Theorem
Given any N observations on a time series process {Y1,Y2,...,YN}, there exist frequencies {ω1,ω2,...,ωM} and coefficients μ, {α1,α2,...,αM}, {δ1,δ2,...,δM} such that
Yt = μ + ∑k=1,...,M {αkcos[ωk(t-1)] + δksin[ωk(t-1)]}
where αkcos[ωk(t-1)] is orthogonal to αjcos[ωj(t-1)] for k≠j, δksin[ωk(t-1)] is orthogonal to δjsin[ωj(t-1)] for k≠j, and αkcos[ωk(t-1)] is orthogonal to δjsin[ωj(t-1)] for all k and j. Furthermore,
μ = ∑t=1,...,NYt/N
αk = (2/N) ∑t=1,...,N Yt cos[ωk(t-1)], k=1,2,...,M
δk = (2/N) ∑t=1,...,N Yt sin[ωk(t-1)], k=1,2,...,M
The sample variance of Yt can be expressed as
γ0 = ∑t=1,...,N (Yt-μ)2/N = (1/2) ∑k=1,...,M (αk2+δk2)
The portion of the sample variance of Yt that can be attributed to cycles of frequency ωk is given by:
(1/2) (αk2+δk2) = (4π/N) sY(ωk)
where sY(ωk) is the sample periodogram at frequency ωk.
Equivalently,
sY(ωk) = [N/(8π)] (αk2 + δk2)
       = [1/(2πN)] { {∑t=1,...,N Yt cos[ωk(t-1)]}2 + {∑t=1,...,N Yt sin[ωk(t-1)]}2 }
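A sketch of the sample spectral representation (numpy; spectral_coefficients is an illustrative name; an odd N is used so that the M = (N-1)/2 frequencies plus the mean account for all N observations): compute αk and δk as above and verify the variance decomposition γ0 = (1/2) ∑k (αk2 + δk2).

```python
import numpy as np

def spectral_coefficients(y):
    """Coefficients of the sample spectral representation at the frequencies
    w_k = 2*pi*k/N, k = 1, ..., M = (N-1)//2."""
    y = np.asarray(y, dtype=float)
    N = len(y)
    M = (N - 1) // 2
    t = np.arange(1, N + 1)
    mu = y.mean()
    alpha, delta = np.empty(M), np.empty(M)
    for k in range(1, M + 1):
        wk = 2.0 * np.pi * k / N
        alpha[k - 1] = 2.0 / N * np.sum(y * np.cos(wk * (t - 1)))
        delta[k - 1] = 2.0 / N * np.sum(y * np.sin(wk * (t - 1)))
    return mu, alpha, delta

# Check the variance decomposition: g0 = (1/2) * sum_k (alpha_k^2 + delta_k^2).
rng = np.random.default_rng(4)
y = rng.normal(size=201)                       # odd N so that 2M + 1 = N
mu, alpha, delta = spectral_coefficients(y)
print(round(np.var(y), 4), round(0.5 * np.sum(alpha ** 2 + delta ** 2), 4))
```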