Topic 4c

Autoregressive Regression Models

ARMA Analysis for Regression Residuals

AR(1), MA(1), ARMA(1,1)

Auto-Regressive Conditional Heteroscedasticity

The Model: ARCH(1), ARCH-M(1), GARCH(1,1)
Model Identification for ARCH Process
Model Estimation

State-Space Models

Model Representation
Kalman Filter
Applications

Readings and References:

W. H. Greene, Econometric Analysis, 5th Ed., Chapter 20: Time Series Models, Prentice-Hall, 2003.
K.-P. Lin, Computational Econometrics: GAUSS Programming for Econometricians and Financial Analysts, Chapter 15: Time Series Analysis.
J. D. Hamilton, "State-Space Models," Handbook of Econometrics, Vol. IV, eds. R. F. Engle and D. L. McFadden, Chapter 50, 3039-3080, Elsevier, 1994.
Readings and References on Autoregressive Conditional Heteroskedasticity
- T. Bollerslev, "Generalized Autoregressive Conditional Heteroskedasticity," Journal of Econometrics 31, 1986, 307-327.
- T. Bollerslev, "A Conditionally Heteroskedastic Time Series Model for Speculative Prices and Rates of Return," Review of Economics and Statistics 69, 1987, 542-547.
- T. Bollerslev and E. Ghysels, "Periodic Autoregressive Conditional Heteroscedasticity," Journal of Business and Economic Statistics 14, 1996, 139-151.
- R. F. Engle, "Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation," Econometrica 50, 1982, 987-1006.
- R. F. Engle, D. M. Lilien, and R. P. Robins, "Estimating Time-Varying Risk Premia in the Term Structure: the ARCH-M Model," Econometrica 55, 1987, 391-407.
- L. R. Glosten, R. Jagannathan, and D. Runkle, "On the Relation Between the Expected Value and the Volatility of the Nominal Excess Return on Stocks," Journal of Finance 48, 1993, 1779-1801.
- D. B. Nelson, "Conditional Heteroskedasticity in Asset Returns: A New Approach," Econometrica 59, 1991, 347-370.

ARMA Analysis for Regression Residuals

Y_t = X_tb + e_t
e_t = r₁e_t-1 + r₂e_t-2 + ... + r_pe_t-p - q₁u_t-1 - q₂u_t-2 - ... - q_qu_t-q + u_t

Y_t = X_tb + r(B)^-1q(B)u_t

where u_t ~ nii(0,s²), that is, u_t is normally, independently, and identically distributed with mean 0 and variance s².

AR(1) Process

e_t = r e_t-1 + u_t

We assume |r| < 1 for model stability. It is clear that

s² = Var(u_t) = (1-r²) Var(e_t).

Denote the variable transformations Y_t^* = Y_t - r Y_t-1 and X_t^* = X_t - r X_t-1. Since u₁ = (1-r²)^½ e₁, the otherwise lost first observation is kept with the transformations Y₁^* = (1-r²)^½Y₁ and X₁^* = (1-r²)^½X₁.

Thus the model for estimation is

u_t = Y_t^* - X_t^*b

with the following Jacobian of the transformation from u_t to Y_t (depending on r only):

J_t(r) = |∂u_t / ∂Y_t| = (1-r²)^½  for t = 1
                         1         for t > 1

Therefore, the (exact) concentrated log-likelihood function is:

ll^*(b,r|Y,X) = -½N (1+ln(2π)-ln(N)) + ½ ln(1-r²) - ½N ln(u'u)
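
The formula above can be sketched in a few lines of Python. This is a minimal illustration rather than the course program: the function name is hypothetical, and it assumes the regression residuals e_t = Y_t - X_tb have already been computed for a trial value of b, so that u_t = Y_t^* - X_t^*b reduces to e_t - r e_t-1 (with the scaled first observation).

```python
import math

def ar1_concentrated_loglik(e, r):
    """Exact concentrated log-likelihood for AR(1) errors.

    e : list of regression residuals e_t = Y_t - X_t b (assumed given)
    r : AR(1) coefficient, must satisfy |r| < 1
    """
    if abs(r) >= 1:
        raise ValueError("stability requires |r| < 1")
    n = len(e)
    # First observation is kept with the (1 - r^2)^(1/2) scaling;
    # later observations use the quasi-difference e_t - r e_{t-1}.
    u = [math.sqrt(1 - r * r) * e[0]]
    u += [e[t] - r * e[t - 1] for t in range(1, n)]
    uu = sum(v * v for v in u)
    return (-0.5 * n * (1 + math.log(2 * math.pi) - math.log(n))
            + 0.5 * math.log(1 - r * r)      # log-Jacobian of observation 1
            - 0.5 * n * math.log(uu))
```

In practice this function would be maximized jointly over b and r by a numerical optimizer; only the likelihood evaluation is shown here.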

Extension: AR(2)

The model is defined as e_t = r₁e_t-1 + r₂e_t-2 + u_t with the following data transformation (Z stands for either X or Y below):

Z₁^* = [(1+r₂)((1-r₂)²-r₁²) / (1-r₂)]^½ Z₁
Z₂^* = (1-r₂²)^½Z₂ - [r₁(1-r₂²)^½/(1-r₂)]Z₁
Z_t^* = Z_t - r₁Z_t-1 - r₂Z_t-2, t=3,4,...,N.
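
As a hedged illustration, the AR(2) transformation can be coded and checked against the theoretical autocovariances of a stationary AR(2) process. The function names below are hypothetical. The scaling of the second observation is written as (1-r₂²)^½[Z₂ - r₁/(1-r₂) Z₁], which is the form that gives the first two transformed errors variance s² and zero covariance.

```python
import math

def ar2_transform(z, r1, r2):
    """Apply the exact AR(2) transformation to a series z (list of floats)."""
    c1 = math.sqrt((1 + r2) * ((1 - r2) ** 2 - r1 ** 2) / (1 - r2))
    a = math.sqrt(1 - r2 ** 2)
    out = [c1 * z[0],
           a * z[1] - (r1 * a / (1 - r2)) * z[0]]
    # Observations 3..N use the ordinary quasi-difference.
    out += [z[t] - r1 * z[t - 1] - r2 * z[t - 2] for t in range(2, len(z))]
    return out

def ar2_autocov(r1, r2, s2=1.0):
    """Theoretical variance g0 and lag-1 autocovariance g1 of AR(2) errors."""
    g0 = s2 * (1 - r2) / ((1 + r2) * ((1 - r2) ** 2 - r1 ** 2))
    g1 = g0 * r1 / (1 - r2)
    return g0, g1

# The transformation is linear, so its weights can be read off unit vectors.
w1 = ar2_transform([1.0, 0.0], 0.5, 0.3)   # weights on Z1
w2 = ar2_transform([0.0, 1.0], 0.5, 0.3)   # weights on Z2
g0, g1 = ar2_autocov(0.5, 0.3)
var_first = w1[0] ** 2 * g0
var_second = (w1[1] ** 2 + w2[1] ** 2) * g0 + 2 * w1[1] * w2[1] * g1
cov_12 = w1[0] * (w1[1] * g0 + w2[1] * g1)
print(var_first, var_second, cov_12)
```

With s² = 1 the two variances should both be 1 and the covariance 0, up to rounding.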

Pre-Sample Data Initialization

Alternatively, pre-sample data initialization may be used to transform the time series:

Y₀ = Y_-1 = ... = Σ_t=1,2,...,N Y_t/N
X₀ = X_-1 = ... = Σ_t=1,2,...,N X_t/N

The resulting maximum likelihood estimation is conditional on the pre-sample data initialization.

MA(1) Process

e_t = u_t - qu_t-1

Again, we assume |q| < 1 for model stability. The model is

u_t = Y_t - X_tb + qu_t-1

Notice that the one-period lag of error terms, u_t-1, is used to define the model error u_t. A recursive calculation is needed with proper initialization of u₀. For example, set the initial value u₀ = E(u_t) = 0 (or alternatively the sample mean of u_t), then u₁ = Y₁-X₁b and u_t = Y_t-X_tb + qu_t-1 for t=2,...,N.

Since each log-Jacobian term vanishes in this case, the (conditional) concentrated log-likelihood function is simply

ll^*(b,q|Y,X) = -½N (1+ln(2π)-ln(N)) - ½N ln(u'u)
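
The recursion for u_t and the conditional likelihood can be sketched as follows. The function names are hypothetical, and the sketch assumes the residuals e_t = Y_t - X_tb are supplied for a trial b and that u₀ defaults to 0 as suggested above.

```python
import math

def ma1_innovations(e, q, u0=0.0):
    """Recover u_t = e_t + q u_{t-1} recursively, starting from u_0."""
    u, prev = [], u0
    for et in e:
        prev = et + q * prev
        u.append(prev)
    return u

def ma1_concentrated_loglik(e, q, u0=0.0):
    """Conditional concentrated log-likelihood for MA(1) errors."""
    n = len(e)
    uu = sum(v * v for v in ma1_innovations(e, q, u0))
    return (-0.5 * n * (1 + math.log(2 * math.pi) - math.log(n))
            - 0.5 * n * math.log(uu))
```

For instance, innovations u = (0.5, -1.0, 0.3) with q = 0.6 and u₀ = 0 imply e = (0.5, -1.3, 0.9), and `ma1_innovations([0.5, -1.3, 0.9], 0.6)` recovers the original u up to rounding.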

ARMA(1,1) Process

e_t = r e_t-1 + u_t - q u_t-1

This is the mixed process of AR(1) and MA(1). Using the variable transformations of the AR(1) case and the data initialization of the MA(1) case, the model is written as

u_t = Y_t^* - X_t^*b + q u_t-1

and the (conditional) concentrated log-likelihood function for parameter estimation is

ll^*(b,r,q|Y,X) = -½N (1+ln(2π)-ln(N)) + ½ ln(1-r²) - ½N ln(u'u)
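
A minimal sketch of this mixed case, under the same assumptions as the formulas above: residuals e_t = Y_t - X_tb are given for a trial b, the first observation is scaled by (1-r²)^½, and u₀ = 0. The function name is hypothetical.

```python
import math

def arma11_concentrated_loglik(e, r, q, u0=0.0):
    """Conditional concentrated log-likelihood for ARMA(1,1) errors."""
    if abs(r) >= 1:
        raise ValueError("stability requires |r| < 1")
    n = len(e)
    # AR(1) part: transformed residuals e*_t = Y*_t - X*_t b.
    e_star = [math.sqrt(1 - r * r) * e[0]]
    e_star += [e[t] - r * e[t - 1] for t in range(1, n)]
    # MA(1) part: u_t = e*_t + q u_{t-1}, accumulated into u'u.
    uu, prev = 0.0, u0
    for v in e_star:
        prev = v + q * prev
        uu += prev * prev
    return (-0.5 * n * (1 + math.log(2 * math.pi) - math.log(n))
            + 0.5 * math.log(1 - r * r)
            - 0.5 * n * math.log(uu))
```

Setting q = 0 reproduces the AR(1) likelihood and setting r = 0 reproduces the MA(1) likelihood, which is a convenient check on any implementation.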

Example

This example demonstrates nonlinear maximum likelihood estimation for three basic autocorrelated regression models: AR(1), MA(1), and ARMA(1,1). Using the U.S. investment data from Greene's Table 13.1, formulate and estimate the three models of autocorrelation for a linear relationship between real investment, real GNP, and the real interest rate:

Invest = b₀ + b₁ Rate + b₂ GNP + e

AR(1): e_t = r e_t-1 + u_t
MA(1): e_t = u_t - qu_t-1
ARMA(1,1): e_t = r e_t-1 + u_t - q u_t-1

Auto-Regressive Conditional Heteroscedasticity

In many financial and monetary economic applications, serial correlation over time appears not only in the means but also in the variances. The latter is captured by the so-called Auto-Regressive Conditional Heteroscedasticity (ARCH) models. In such models the variance may still be unconditionally homoscedastic.

The Model

Consider the time series regression model:

Y_t = X_tb + e_t

At time t, conditional on the available historical information H_t, we assume that the error follows a normal distribution:

e_t|H_t ~ n(0,s²_t)

where s²_t = a₀ + d₁s²_t-1 + ... + d_ps²_t-p + a₁e²_t-1 + ... + a_qe²_t-q

= a₀ + Σ_i=1,2,...,p d_is²_t-i + Σ_j=1,2,...,q a_je²_t-j

Let u_t = e²_t - s²_t, a_i = 0 for i > q, d_j = 0 for j > p, and m = max(p,q). Then the above GARCH(p,q) process may be conveniently rewritten as an ARMA(m,p) model for e²_t. That is,

e²_t = a₀ + Σ_i=1,2,...,m (a_i+d_i)e²_t-i - Σ_j=1,2,...,p d_ju_t-j + u_t

This is the general specification of generalized auto-regressive conditional heteroscedasticity, or GARCH(p,q), according to Bollerslev [1986]. If p = 0, then it is the GARCH(0,q) or simply ARCH(q) process:

s²_t = a₀ + Σ_j=1,2,...,q a_je²_t-j
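
The ARMA representation of e²_t is easiest to see in the simplest mixed case p = q = 1. The short derivation below is only a worked instance of the general statement, written with the same symbols (a, d, e, s, u) as the text.

```latex
\begin{align*}
s_t^2 &= a_0 + a_1 e_{t-1}^2 + d_1 s_{t-1}^2
  && \text{GARCH(1,1)} \\
e_t^2 - u_t &= a_0 + a_1 e_{t-1}^2 + d_1\,(e_{t-1}^2 - u_{t-1})
  && \text{substitute } s_t^2 = e_t^2 - u_t \\
e_t^2 &= a_0 + (a_1 + d_1)\, e_{t-1}^2 - d_1 u_{t-1} + u_t
  && \text{ARMA(1,1) in } e_t^2
\end{align*}
```

The autoregressive coefficient is a₁ + d₁ and the moving-average coefficient is d₁, which agrees with the general formula for m = 1 and p = 1.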

ARCH(1) Process

The simplest case is q = 1, or ARCH(1), introduced by Engle [1982] as follows:

s²_t = a₀ + a₁e²_t-1

The ARCH(1) model can be summarized as follows:

Y_t = X_tb + e_t
e_t = u_t(a₀ + a₁e²_t-1)^½ where u_t ~ nii(0,1)

Then the conditional mean is E(e_t|e_t-1) = 0 and the conditional variance is s²_t = E(e²_t|e_t-1) = a₀ + a₁e²_t-1.

Note that the unconditional variance of e_t is

E(e²_t) = E(E(e²_t|e_t-1)) = a₀ + a₁E(e²_t-1).

If s² = E(e²_t) = E(e²_t-1), then s² = a₀/(1-a₁) provided that |a₁| < 1. Therefore, the model may be unconditionally homoscedastic even though conditional heteroscedasticity is assumed.
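
A small simulation can illustrate this result. The sketch below is an assumption-laden illustration, not part of the course programs: the function name, the parameter values a₀ = 1 and a₁ = 0.3, the seed, and the burn-in length are all arbitrary choices.

```python
import math
import random

def simulate_arch1(n, a0, a1, seed=0, burn=500):
    """Simulate n draws of e_t = u_t (a0 + a1 e_{t-1}^2)^(1/2), u_t ~ N(0,1)."""
    rng = random.Random(seed)
    e_prev, out = 0.0, []
    for t in range(n + burn):
        s2 = a0 + a1 * e_prev ** 2          # conditional variance
        e_prev = rng.gauss(0.0, 1.0) * math.sqrt(s2)
        if t >= burn:                       # discard start-up draws
            out.append(e_prev)
    return out

e = simulate_arch1(100_000, a0=1.0, a1=0.3)
sample_var = sum(v * v for v in e) / len(e)
theory = 1.0 / (1.0 - 0.3)                  # a0 / (1 - a1)
print(sample_var, theory)
```

The sample second moment should be close to a₀/(1-a₁), even though the conditional variance changes from period to period.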

The ARCH(1) process can be generalized (hence the name Generalized Auto-Regressive Conditional Heteroscedasticity) to:

GARCH(1,1) Process

s²_t = a₀ + a₁ e²_t-1 + d₁ s²_t-1

This resembles the mixed auto-regressive moving-average process ARMA(1,1) described in the autocorrelation section. Presample variances and squared error terms can be initialized with Σ_t=1,2,...,N e²_t/N. The following parameter restrictions ensure positive variances and preserve stationarity of the error process:

a₀ > 0
a₁ ≥ 0
d₁ ≥ 0
a₁ + d₁ < 1

Another extension is ARCH or GARCH in mean (ARCH-M or GARCH-M model) which adds the heteroscedastic variance term directly into the regression equation (assuming linear model):

ARCH-M(1) or GARCH-M(1,1) Model

e_t = Y_t - X_tb - gs²_t

s²_t = a₀ + a₁ e²_t-1 (or s²_t = a₀ + a₁ e²_t-1 + d₁ s²_t-1)

The variance term in the regression may instead be expressed in log form or as the standard deviation s_t. For example, Y_t = X_tb + g ln(s²_t) + e_t. Moreover, constraints on the parameters in the conditional variance equation may be required to ensure the positivity of variances: a₀ > 0, 0 ≤ a₁ < 1 (or a₁ + d₁ < 1, d₁ ≥ 0).

Asymmetric GARCH(1,1) Model

There is considerable evidence from financial markets that a negative surprise (change in asset returns) tends to increase volatility (variance or risk) more than a positive surprise of the same size. Therefore, not only the size of the return but also its sign (negative or positive) is important in describing the variance of asset returns. Consider the following simple model:

Y_t = X_tb + e_t
e_t = s_tu_t

Based on the GJR specification (Glosten-Jagannathan-Runkle, 1993),

s_t² = a₀ + a₁e_t-1² + d₁s_t-1² + g₁(e_t-1²D_t-1)

where D_t-1 = 1 if e_t-1 > 0
              0 otherwise

The case g₁ < 0 is sometimes referred to as the Leverage Effect. The non-negativity of s_t² is satisfied provided that a₀ > 0, d₁ ≥ 0, and a₁+g₁ ≥ 0.
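
The asymmetric variance recursion can be sketched as below. It follows the indicator convention used here (D = 1 for a positive lagged error). The function name is hypothetical, and the presample choices (e₀² and s₀² set to a common starting value, D₀ = 0) are assumptions made for illustration.

```python
def gjr_variance(e, a0, a1, d1, g1, s2_init=None):
    """Return the list of conditional variances s_t^2, t = 1..N."""
    n = len(e)
    if s2_init is None:
        s2_init = sum(v * v for v in e) / n   # sample variance of residuals
    e2_prev, s2_prev, d_prev = s2_init, s2_init, 0.0
    s2 = []
    for et in e:
        s2_t = a0 + a1 * e2_prev + d1 * s2_prev + g1 * e2_prev * d_prev
        s2.append(s2_t)
        # Carry forward the lagged terms for the next period.
        e2_prev, s2_prev = et * et, s2_t
        d_prev = 1.0 if et > 0 else 0.0
    return s2

# With g1 < 0, a positive shock raises next-period variance by less
# than a negative shock of the same size.
print(gjr_variance([1.0, -1.0, 1.0], 0.1, 0.2, 0.5, -0.1, s2_init=1.0))
```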

Model Identification for ARCH and GARCH Processes

Autocorrelation Function and Partial Autocorrelation Function based on the squares of the regression residuals e_t (or of the standardized residuals e_t/s_t if s_t is suspected of being non-constant).
Engle-Bollerslev LM Test of GARCH Effects (Bollerslev [1986]).
Testing H₀: a₁ = a₂ = ... = a_q = 0 in the linear regression equation e²_t = a₀ + a₁e²_t-1 + a₂e²_t-2 + ... + a_qe²_t-q + u_t, based on the test statistic NR², which is asymptotically distributed as Chi-Square(q) under H₀.
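
For q = 1 the LM statistic needs only a simple regression of e²_t on a constant and e²_t-1, whose R² is the squared sample correlation. The sketch below is limited to that case. The function name is hypothetical, and N is taken to be the number of observations actually used in the auxiliary regression (one fewer than the sample size), which is an assumption about the convention.

```python
def arch_lm_stat_q1(e):
    """N * R^2 from regressing e_t^2 on a constant and e_{t-1}^2."""
    y = [v * v for v in e[1:]]     # e_t^2
    x = [v * v for v in e[:-1]]    # e_{t-1}^2
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r2 = sxy * sxy / (sxx * syy)   # R^2 of a simple regression
    return n * r2
```

The statistic is compared with a Chi-Square(1) critical value (about 3.84 at the 5% level). Larger values of q require a multiple regression, which is not shown here.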

Model Estimation

Recall the normal log-likelihood of a heteroscedastic regression model:

ll = -½N ln(2π) - ½ Σ_t=1,2,...,N ln(s²_t) - ½ Σ_t=1,2,...,N (e²_t / s²_t)

with the general conditional heteroscedastic variance GARCH(p,q) process:

s²_t = a₀ + a₁e²_t-1 + a₂e²_t-2 + ... + a_qe²_t-q + d₁s²_t-1 + d₂s²_t-2 + ... + d_ps²_t-p

The parameter vector (a, d) is estimated together with the regression parameters (e.g., e_t = Y_t - X_tb) by maximizing the log-likelihood, conditional on the starting values e₀², e²_-1, ..., e²_-q, s²₀, s²_-1, ..., s²_-p and satisfying the positivity requirement for the estimated variances: s²_t > 0, t=1,2,...,N.

We note that the presample series e₀², e²_-1, ..., e²_-q, s²₀, s²_-1, ..., s²_-p may be initialized by the estimated (homoscedastic) unconditional variance:

a₀ / [1 - (Σ_i=1,2,...,q a_i + Σ_j=1,2,...,p d_j)]

or by the estimated sample variance of residuals:

Σ_t=1,2,...,N e²_t/N.
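
As an illustration of the likelihood evaluation, the following sketch computes the normal log-likelihood for the GARCH(1,1) case only. It assumes the residuals e_t = Y_t - X_tb are given for a trial b and initializes the presample e₀² and s₀² with the sample variance of residuals. The function name is hypothetical.

```python
import math

def garch11_loglik(e, a0, a1, d1):
    """Normal log-likelihood with s_t^2 = a0 + a1 e_{t-1}^2 + d1 s_{t-1}^2."""
    n = len(e)
    init = sum(v * v for v in e) / n          # presample e_0^2 and s_0^2
    e2_prev, s2_prev = init, init
    ll = -0.5 * n * math.log(2 * math.pi)
    for et in e:
        s2 = a0 + a1 * e2_prev + d1 * s2_prev
        if s2 <= 0:                           # positivity requirement violated
            return float("-inf")
        ll -= 0.5 * (math.log(s2) + et * et / s2)
        e2_prev, s2_prev = et * et, s2
    return ll
```

Returning minus infinity for a non-positive variance is one simple way to keep a numerical optimizer inside the admissible region; imposing the parameter restrictions directly is an alternative.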

Example

This example investigates the "long-run volatility" persistence of the Deutschemark-British pound exchange rate (Bollerslev and Ghysels [1996]). Daily exchange rates from January 3, 1984 to December 31, 1991 (1974 observations) are used (see DMBP.TXT).

The model of interest is

Y_t = 100 [ln(P_t) - ln(P_t-1)] = m + e_t

where P_t is the bilateral spot Deutschemark-British pound exchange rate. Thus Y_t is the daily percentage nominal return on the DM/BP exchange rate. Test, identify, and estimate the appropriate GARCH(p,q) variance structure.

Example

U. S. inflation measured as the quarterly rate of change in the log of the price:

dP_t = 100 [ln(P_t) - ln(P_t-1)]

is believed to be affected by previous excess monetary growth (monetary growth faster than the growth of real output) and by external shocks. Excess monetary growth is defined as dM - dY, where

dM_t = 100 [ln(M1_t) - ln(M1_t-1)]
dY_t = 100 [ln(GNP_t) - ln(GNP_t-1)]

The basic model is represented by the following:

dP_t = b₀ + b₁(dM_t-1-dY_t-1) + e_t

In addition, the lagged values of the inflation rate (or the disturbance) will carry the effects of external shocks to the economy.

The data file USINF.TXT consists of 136 quarterly observations (from 1950 Q1 to 1984 Q4) of data series for the price (implicit GNP deflator) P_t, money stock M1_t, and output (GNP) Y_t. Identify and estimate the best model of the U.S. inflation rate in which serial correlation may exist in the means, in the variances, or in both (see Greene [1999], Example 18.11).

Example

To demonstrate the ARCH innovation process for the U.S. inflation rate defined in the previous example, dP_t may be specified with a combination of distributed lags, ARMA, and GARCH models (see Greene [1999], Example 18.12).

State-Space Models

State-space analysis deals with dynamic time series models that involve unobserved state variables such as inflation expectations, permanent income, time-varying parameters, etc. The basic tool used to study the state-space model is the Kalman Filter, a recursive algorithm that predicts the unobserved component or state vector at time t from the information available through time t-1 and then updates the estimate once Y_t is observed.

Model Representation

A state-space model consists of two equations:

Measurement Equation (Observation Equation): The relationship between the observed variables (n×1 data vector Y_t) and the unobserved state variables (k×1 parameter vector b_t).
Y_t = H_tb_t + a_t + u_t
where H_t is an n×k matrix and a_t is an n×1 vector, which may be either data on exogenous variables or constant parameters. That is, given the exogenous or predetermined observed variables X_t, we may define H_t = H(X_t) and a_t = a(X_t).
We assume u_t ~ nii(0_n×1,R_n×n). Note that the covariance matrix R may also depend on X_t.
Transition Equation (State Equation): The first-order difference equation describing the dynamics of the state variables.
b_t = c_t + F_tb_t-1 + v_t
where F_t is a k×k matrix and c_t is a k×1 vector.
We assume v_t ~ nii(0_k×1,Q_k×k) and Cov(u_t,v_s) = E(u_tv_s') = 0_n×k. Note that c_t = c(X_t), F_t = F(X_t), and the covariance matrix Q may depend on X_t.

Conditional on the information available at time t-1, the expected value of b_t is E_t-1(b_t) = c_t + F_tE_t-1(b_t-1). Similarly, the conditional covariance is Var_t-1(b_t) = F_tVar_t-1(b_t-1)F_t' + Q. For notational convenience, let b_t|t-1 = E_t-1(b_t) and W_t|t-1 = Var_t-1(b_t). Then,

b_t|t-1 = c_t + F_tb_t-1|t-1
W_t|t-1 = F_tW_t-1|t-1F_t' + Q

Combining the measurement and transition equations, we have

Y_t = (H_tF_t)b_t-1 + (H_tc_t+a_t) + (H_tv_t+u_t)

Given the information at time t-1, the conditional expectation and covariance of Y_t are:

Y_t|t-1 = E_t-1(Y_t) = H_tb_t|t-1 + a_t
S_t|t-1 = Var_t-1(Y_t) = H_tW_t|t-1H_t' + R

Since Y_t, conditional on the information at time t-1, is normally distributed with mean Y_t|t-1 and covariance S_t|t-1, the log-likelihood of observation t is evaluated as:

ll_t = - ½ ln det(2πS_t|t-1) - ½ (Y_t-Y_t|t-1)'S_t|t-1^-1(Y_t-Y_t|t-1)

Kalman Filter

The log-likelihood function for parameter estimation is computed with the Kalman Filter algorithm as follows:

Prediction
b_t|t-1 = c_t + F_tb_t-1|t-1
W_t|t-1 = F_tW_t-1|t-1F_t' + Q
Define the prediction error e_t|t-1 = Y_t - Y_t|t-1. Then
e_t|t-1 = Y_t - H_tb_t|t-1 - a_t
S_t|t-1 = H_tW_t|t-1H_t' + R
Then the log-likelihood is defined by
ll_t = - ½ ln det(2πS_t|t-1) - ½ e_t|t-1'S_t|t-1^-1e_t|t-1
Updating
b_t|t = b_t|t-1 + K_te_t|t-1
W_t|t = W_t|t-1 - K_tH_tW_t|t-1
where K_t = W_t|t-1H_t'S_t|t-1^-1 is the Kalman gain.

The above basic filter (prediction and updating) is carried out iteratively from t=1 to t=T. At the end, the sum of log-likelihoods is maximized with respect to the model parameters. To begin at time t=1, the initial values b_0|0 and W_0|0 must be given. If b_t is stationary, then the unconditional expectation and covariance may be used:

b_0|0 = (I-F)^-1c
vec(W_0|0) = (I-F⊗F)^-1vec(Q)

If b_t is nonstationary, then we can use a rough guess for b_0|0 (e.g., a vector of zeros) with large diagonal elements in the covariance matrix W_0|0. In this case, the evaluation of the log-likelihood and inference should exclude the first few observations, which are dominated by the guessed initial values.
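
The prediction and updating steps can be sketched for the scalar case n = k = 1, where every matrix reduces to a number. This is a simplified illustration under stated assumptions: a, c, F, R, and Q are constant over time, H_t is supplied as a list, the function name is hypothetical, and the default initial values are the stationary ones, which in the scalar case are c/(1-F) and Q/(1-F²).

```python
import math

def kalman_filter_scalar(y, h, a, c, f, r, q, b0=None, w0=None):
    """Scalar Kalman filter; returns (log-likelihood, [(b_t|t, W_t|t), ...])."""
    if b0 is None:
        b0 = c / (1 - f)            # (I - F)^-1 c, requires |F| < 1
    if w0 is None:
        w0 = q / (1 - f * f)        # (I - F (x) F)^-1 vec(Q) in scalar form
    b, w, ll, filtered = b0, w0, 0.0, []
    for yt, ht in zip(y, h):
        # Prediction
        b_pred = c + f * b
        w_pred = f * w * f + q
        err = yt - ht * b_pred - a  # prediction error e_t|t-1
        s = ht * w_pred * ht + r    # its variance S_t|t-1
        ll += -0.5 * math.log(2 * math.pi * s) - 0.5 * err * err / s
        # Updating
        k = w_pred * ht / s         # Kalman gain
        b = b_pred + k * err
        w = w_pred - k * ht * w_pred
        filtered.append((b, w))
    return ll, filtered
```

For a nonstationary state, `b0` and a large `w0` would be passed explicitly and the first few likelihood terms dropped, as described above. The general vector case replaces the scalar products by matrix products and the division by S with a matrix inverse.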

As a by-product of maximum likelihood estimation, we obtain the estimated (updated) parameter vector and the corresponding covariance matrix at time t: b_t|t and W_t|t, for t=1,...,T. For better inference, the smoothed parameter vector and the corresponding covariance matrix based on all information in the sample are:

b_t|T = b_t|t + K^*_t+1(b_t+1|T-c_t+1-F_t+1b_t|t)
W_t|T = W_t|t + K^*_t+1(W_t+1|T-W_t+1|t)K^*_t+1'

where K^*_t+1 = W_t|tF_t+1'W_t+1|t^-1. The smoothing is performed from t=T-1 down to t=1 with the initial values b_T|T and W_T|T obtained from the last iteration of the basic filter.

Applications

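AR(p) Model
Y_t = d + r₁Y_t-1 + ... + r_pY_t-p + e_t
e_t ~ nii(0,s²)
- Measurement Equation: Y_t = Hb_t + a + u_t, u_t ~ nii(0,R), or
  
  \[
  Y_t = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}
  \begin{bmatrix} Y_t \\ Y_{t-1} \\ \vdots \\ Y_{t-p+1} \end{bmatrix}
  \]
  
  where a = 0, u_t = 0, and R = 0
- Transition Equation: b_t = Fb_t-1 + c + v_t, v_t ~ nii(0,Q), or
  
  \[
  \begin{bmatrix} Y_t \\ Y_{t-1} \\ \vdots \\ Y_{t-p+1} \end{bmatrix}
  =
  \begin{bmatrix}
  r_1 & r_2 & \cdots & r_{p-1} & r_p \\
  1 & 0 & \cdots & 0 & 0 \\
  \vdots & \vdots & \ddots & \vdots & \vdots \\
  0 & 0 & \cdots & 1 & 0
  \end{bmatrix}
  \begin{bmatrix} Y_{t-1} \\ Y_{t-2} \\ \vdots \\ Y_{t-p} \end{bmatrix}
  +
  \begin{bmatrix} d \\ 0 \\ \vdots \\ 0 \end{bmatrix}
  +
  \begin{bmatrix} e_t \\ 0 \\ \vdots \\ 0 \end{bmatrix}
  \]
  
  where
  
  \[
  Q = \begin{bmatrix}
  s^2 & 0 & \cdots & 0 \\
  0 & 0 & \cdots & 0 \\
  \vdots & \vdots & \ddots & \vdots \\
  0 & 0 & \cdots & 0
  \end{bmatrix}
  \]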
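MA(q) Model
Y_t = m + e_t - q₁e_t-1 - ... - q_qe_t-q
e_t ~ nii(0,s²)
- Measurement Equation: Y_t = Hb_t + a + u_t, u_t ~ nii(0,R), or
  
  \[
  Y_t = \begin{bmatrix} 1 & -q_1 & \cdots & -q_q \end{bmatrix}
  \begin{bmatrix} e_t \\ e_{t-1} \\ \vdots \\ e_{t-q} \end{bmatrix}
  + m
  \]
  
  where u_t = 0, and R = 0
- Transition Equation: b_t = Fb_t-1 + c + v_t, v_t ~ nii(0,Q), or
  
  \[
  \begin{bmatrix} e_t \\ e_{t-1} \\ \vdots \\ e_{t-q} \end{bmatrix}
  =
  \begin{bmatrix}
  0 & 0 & \cdots & 0 & 0 \\
  1 & 0 & \cdots & 0 & 0 \\
  \vdots & \vdots & \ddots & \vdots & \vdots \\
  0 & 0 & \cdots & 1 & 0
  \end{bmatrix}
  \begin{bmatrix} e_{t-1} \\ e_{t-2} \\ \vdots \\ e_{t-q-1} \end{bmatrix}
  +
  \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
  +
  \begin{bmatrix} e_t \\ 0 \\ \vdots \\ 0 \end{bmatrix}
  \]
  
  where
  
  \[
  Q = \begin{bmatrix}
  s^2 & 0 & \cdots & 0 \\
  0 & 0 & \cdots & 0 \\
  \vdots & \vdots & \ddots & \vdots \\
  0 & 0 & \cdots & 0
  \end{bmatrix}
  \]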
Time-Varying Parameters Model
Y_t = X_tb_t + e_t
e_t ~ nii(0,s²)
- Measurement Equation: Y_t = H_tb_t + a + u_t, u_t ~ nii(0,R)
  where H_t = X_t, a = 0, u_t = e_t, R = s².
- Transition Equation: b_t = Fb_t-1 + c + v_t, v_t ~ nii(0,Q)
  where F, c and Q may be defined according to a model specification.
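
To make the AR(p) representation above concrete, the system matrices can be built as nested Python lists and one step of the transition equation checked against the AR recursion. This is an illustrative sketch. The function names are hypothetical, and the numbers in the usage lines are arbitrary.

```python
def ar_state_space(d, r, s2):
    """Return (H, F, c, Q) for Y_t = d + r_1 Y_{t-1} + ... + r_p Y_{t-p} + e_t."""
    p = len(r)
    H = [1.0] + [0.0] * (p - 1)
    # First row holds the AR coefficients; the rows below shift the state down.
    F = [[float(v) for v in r]]
    F += [[1.0 if j == i else 0.0 for j in range(p)] for i in range(p - 1)]
    c = [float(d)] + [0.0] * (p - 1)
    Q = [[s2 if i == j == 0 else 0.0 for j in range(p)] for i in range(p)]
    return H, F, c, Q

def transition(F, c, b_prev, shock):
    """One step of b_t = F b_{t-1} + c + v_t with v_t = (shock, 0, ..., 0)'."""
    p = len(c)
    b = [sum(F[i][j] * b_prev[j] for j in range(p)) + c[i] for i in range(p)]
    b[0] += shock
    return b

# AR(2) with d = 0.5, r = (0.6, 0.3); previous state (Y_{t-1}, Y_{t-2}) = (2, 1).
H, F, c, Q = ar_state_space(0.5, [0.6, 0.3], s2=1.0)
b_t = transition(F, c, [2.0, 1.0], shock=0.1)
y_t = sum(hi * bi for hi, bi in zip(H, b_t))   # measurement equation, u_t = 0
print(b_t, y_t)
```

The first element of the new state equals d + r₁Y_t-1 + r₂Y_t-2 + e_t, and the second element simply carries Y_t-1 forward.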

Example

C-J. Kim and C. R. Nelson, "The Time-Varying-Parameter Model for Modeling Changing Conditional Variance: The case of the Lucas Hypothesis," Journal of Business and Economic Statistics, 1989, 433-440.

The State-Space Model Representation

Measurement Equation:
DM_t = b_0t + b_1tDR_t-1 + b_2tDP_t-1 + b_3tSURP_t-1 + b_4tDM_t-1 + u_t
u_t ~ nii(0,s²)
Transition Equation:
b_it = b_it-1 + v_it
v_it ~ nii(0,s_i²) i = 0,1,...,4.

Data Description

DM = Quarterly M1 growth rate
DR = Change in 3-month T-bill interest rate
DP = Inflation rate as measured by the CPI
SURP = Detrended full employment budget surplus

Fixed Parameters

s², s₀², s₁², s₂², s₃², s₄².

Time-Varying Parameters

b_0t, b_1t, b_2t, b_3t, b_4t.


Last updated: 05/21/2007
