Consider the M-state (regime or structure) linear regression equation:
Yt = Xtb(i) +
et(i)
et(i) ~
normal(0,s2(i))
where [Yt,Xt] is the observed sample data, t=1,2,...,N, and (i) is the state (regime) indicator, i=1,2,...,M. In particular, assuming normal distribution, the conditional probability density function of Yt occured in the regime i is written as:
f(Yt;b(i),s2(i)) = 1/Ö(2ps2(i)) exp(-½(Yt-Xtb(i))2/s2(i))
In general, the regression parameters could change several times (that is, switching backward and forward) within M possible regimes. Typically, M=2.
Dit = | 1, if t in N(i) |
0 otherwise |
Since Dit is observed, given the conditional probability density function f(Yt;b(i),s2(i)) of Yt, the log-likelihood function for model estimation is:
ll(q) = åt=1,2,...,Nln åi=1,2,...,M Ditf(Yt;b(i),s2(i))
where q = (b(i),s2(i), i=1,2,...,M)' is the parameter vector of the model.
Dit = | 1, if St = i (the regime is i at time t) |
0 otherwise |
Dit is a random discrete variable taking value either 1 or 0, which is not directly observed. Let Dit* be the latent variable for Dit such that Dit* > 0 if Dit = 1, and Dit* £ 0 otherwise (Dit = 0). We write:
Dit* = a(i) + ut
Or, in a more general framework, Dit* = Zta(i) + ut where Zt is a set of exogenous or predetermined variables determining the regime structure.
When Dit* > 0, ut > - a(i). Assuming ut ~ normal(0,1), the probability of Dit = 1 is represented by the cumulative normal density evaluated from -¥ to a(i). Then the probability that St=i is expressed as:
Pr(St=i;a(i)) | = Pr(Dit=1) = Pr(Dit*>0) = Pr(ut>-a(i)) |
= ò-¥ a(i) 1/Ö(2p) exp(-½ z2)dz |
Of course, åi=1,2,...,M Pr(St=i;a(i)) = 1. Then the conditional joint probability density function of Yt and St=i is
f(Yt,St=i;b(i),s2(i),a(i))
= Pr(St=i;a(i))
f(Yt|St=i;b(i),s2(i))
= ò-¥
a(i)1/Ö(2p)
exp(-½ z2)dz
[1/Ö(2ps2(i))
exp(-½(Yt-Xtb(i))2/s2(i))]
Let q(i) = (b(i),s2(i),a(i))' and q =(q(i), i=1,2,...,M)'. Then, the unconditional probability density function of Yt is
f(Yt;q) = åi=1,2,...,M f(Yt,St=i;q(i))
The resulting log-likelihood function is:
ll(q) | = åt=1,2,...,N ln f(Yt;q) |
= åt=1,2,...,N ln åi=1,2,...,M f(Yt,St=i;q(i)) |
As a by-product of the maximum likelihood estimation of the parameter vector q, the estimated conditional probability for the observation t occurred in regime i is computed as follows:
Pr(St=i|Yt;q) | = f(Yt,St=i;q(i)) / f(Yt;q) |
= f(Yt,St=i;q(i)) / åi=1,2,...,M f(Yt,St=i;q(i)) |
f(Yt|St=i;m(i),s2(i)) = 1/Ö(2ps2(i)) exp(-½(Yt-m(i))2/s2(i)) with probability Pr(St=i), i=1,2.
Let Pr(St=1)=p, then Pr(St=2) = 1-p. Here p>0 must be estimated together with the parameters m(i) and s2(i) for i=1,2. The log-likelihood function is:
ll(p,m(1),m(2),s2(1),s2(2)) = åt=1,2,...,N ln [p f(Yt|St=1;m(1),s2(1)) + (1-p) f(Yt|St=2;m(2),s2(2))]
Finally, the conditional probability of INCOME variable Yt drawn from regime 1 is
p f(Yt|St=1;m(1),s2(1)) / [p f(Yt|St=1;m(1),s2(1)) + (1-p) f(Yt|St=2;m(2),s2(2))]
Pr(St=i|St-1=j) = pij > 0
åi=1,2,...,M pij = 1
Then the joint probability of St=i and St-1=j is defined by
Pr(St=i,St-1=j) | = Pr(St=i|St-1=j)Pr(St-1=j) = pijPr(St-1=j) |
and Pr(St=i) | = åj=1,2,...,M Pr(St=i,St-1=j) |
= åj=1,2,...,M pijPr(St-1=j) |
For a two-state Markov switching process (M=2), we write:
p = p11 = Pr(St=1|St-1=1) | 1-p = p21 = Pr(St=2|St-1=1) |
q = p22 = Pr(St=2|St-1=2) | 1-q = p12 = Pr(St=1|St-1=2) |
Then,
Pr(St=1) = p Pr(St-1=1) + (1-q) Pr(St-1=2)
Pr(St=2) = (1-p) Pr(St-1=1) + q Pr(St-1=2)
The probabilities of a steady state (that is, S = St = St-1) are:
Pr(S=1) = (1-p)/(2-p-q)
Pr(S=2) = (1-q)/(2-p-q)
Let Pr(St=1) = Pr(St*£0) and Pr(St=2) = Pr(St*>0), where St* is a latent variable for the discrete state indicator St defined by:
St* = a0 + a1 (St-1-1) + ut, ut ~ normal(0,1)
Therefore,
p = Pr(St=1|St-1=1) | = Pr(St*£0|St-1=1) = Pr(ut£-a0) |
= ò-¥ -a01/Ö(2p) exp(-½ z2)dz | |
q = Pr(St=2|St-1=2) | = Pr(St*>0) = Pr(ut>-a0-a1) |
= ò-a0-a1 ¥1/Ö(2p) exp(-½ z2)dz | |
= 1 - ò-¥ -a0-a11/Ö(2p) exp(-½ z2)dz |
It clear that through probability transformation, 0 < p < 1 and 0 < q < 1. An alternative approach to ensure that 0 < p < 1 and 0 < q < 1 is to use logistic transformation as follows:
p = exp(p0)/(1+exp(p0))
q = exp(q0)/(1+exp(q0))
where p0 and q0 are unrestricted real numbers.
Let Ht-1 = {Yt-1,Yt-2,...} be the historical information up to time t-1. Suppose the conditional probability density function of Yt (conditional to St=i, St-1=j and Ht-1, ignoring the parameter vector q for the moment) is f(Yt|St=i,St-1=j,Ht-1). Then the conditional joint probability density function of Yt, St, and St-1 (conditional to Ht-1) is:
f(Yt,St=i,St-1=j|Ht-1) = Pr(St=i,St-1=j|Ht-1) f(Yt|St=i,St-1=j,Ht-1)
and
f(Yt|Ht-1;q) = åi=1,2åj=1,2 f(Yt,St=i,St-1=j|Ht-1;q)
where q is the parameter vector. Finally, the conditional log-likelihood function is:
ll(q) = åt=1,2,...,N ln f(Yt|Ht-1;q)
(1) | Pr(St=i,St-1=j|Ht-1) = Pr(St=i|St-1=j) Pr(St-1=j|Ht-1) |
Starting t=1, let Pr(S0=1|H0) = (1-p)/(2-p-q) and Pr(S0=2|H0) = (1-q)/(2-p-q). These are the steady-state or unconditional probability of St=1 and St=2, respectively.
From the computed Pr(St=i,St-1=j|Ht-1) and given f(Yt|St=i,St-1=j,Ht-1), compute the joint pdfs f(Yt,St=i,St-1=j|Ht-1) for all combination of i,j=1,2, and therefore the conditional likelihood at t is:
(2) | f(Yt|Ht-1) | = åi=1,2åj=1,2 f(Yt,St=i,St-1=j|Ht-1) |
= åi=1,2åj=1,2 Pr(St=i,St-1=j|Ht-1) f(Yt|St=i,St-1=j,Ht-1) |
By definition, Pr(St=i,St-1=j|Ht) | = Pr(St=i,St-1=j|Yt,Ht-1) |
= f(Yt,St=i,St-1=j|Ht-1) / f(Yt|Ht-1) | |
= f(Yt|St=i,St-1=j,Ht-1)Pr(St=i,St-1=j|Ht-1) / f(Yt|Ht-1) | |
and Pr(St=i|Ht) | = åj=1,2 Pr(St=i,St-1=j|Ht) |
To iterate from t=1 to N, use Pr(St=i|Ht) to compute Pr(St+1=k,St=i|Ht) as (1) and the conditional likelihood f(Yt+1|Ht) as (2). Finally, we sum all the log of component log-likelihood for maximization with respect to the parameter vector q.
The above algorithm may be generalized straightforwardly to consider M cases (M>2) and L lags (L>1), in which the probabilities and pdfs are evaluated over a large number of ML+1 cases.
For parameter estimation, we apply standard numerical maximization of the log-likelihood function for the entire sample. The alternative method is the EM (Expectation-Maximization) Algorithm.
To make inference about the state variable St, we can use filtered probability: Pr(St=i|Ht), or smoothed probability: Pr(St=i|HN). The later refers to the probability of St=i conditional on all the information in the sample. The smoothed probability is obtained by updating the filtered probability using all information.
Yt = 100*(ln(GDPt)-ln(GDPt-1))
Then 2-state ("boom" and "burst", or expansion and recession) growth in real GDP is modeled as an AR(4) process:
Yt = m(i) +
et, i=1,2
et =
r1et-1 +
r2et-2 +
r3et-3 +
r4et-4 + ut
where ut ~ normal(0,s2). By assuming the first-order Markov switching process, the transition probabilities between two consecutive states St and St-1 are Pr(St=1|St-1=1)=p and Pr(St=2|St-1=2)=q. By definition, Pr(St=1|St-1=2)=1-q and Pr(St=2|St-1=1)=1-p.
The model can be represented equivalently as:
(Yt-m(i)) = r1(Yt-1-m(j)) + r2(Yt-2-m(k)) + r3(Yt-3-m(l)) + r4(Yt-4-m(m)) + ut
with i,j,k,l,m = 1,2. In total, there are 32 (=24+1) cases of probabilities and pdfs forming the likelihood function for maximization with respect to the parameter vector. In addition to the model estimation, both filtered and smoothed probabilities are computed for inference about the state variable St: Pr(St|Ht) and Pr(St|HN) for each observation t. (See Program, require GPE2 application module MARKOV1.GPE [draft version]).
(Yt-m(i)) =
r1(Yt-1-m(j)) +
r2(Yt-2-m(k)) + ut
ut ~ normal(0,s2(i))
with i,j,k = 1,2,3, where Yt is the ex-post real interest rate calculated by subtracting the CPI-based inflation rate from the nominal interest rate (three-month treasury bill rate). See Data.
Transition probability for the 3-state 1st-order Markov process is defined by pij = Pr(St=i|St-1=j) > 0, where åi=1,2,3 pij = 1 for j=1,2,3.
In total, there are 27(=32+1) cases of probabilities and density functions forming the likelihood function for maximization with respect to 14 parameters in the model:
m(1), m(2), m(3)
r1, r2
s2(1), s2(2), s2(3)
p11, p12, p13, p21, p22, p23
Formulate and estimate the log-likelihood function for Garcia-Perron Model of real interest rate. (Program)