Panel Data

Table of Contents

The Model

Model Estimation

Extensions

Example: Cost Function


The Model

For each cross section (individual) i=1,2,...N and each time period (time) t=1,2,...T,

Yit = Xitβit + εit

Let βit = β and assume εit = ui + vt + eit, where ui represents the individual (cross section) difference in intercept and vt the time difference in intercept. Two-way analysis includes both time and individual effects. For simplicity, we further assume vt = 0; that is, there is no time effect. Only the one-way individual effects are analyzed in what follows.

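The component eit is a classical error term with zero mean and homogeneous variance, no serial correlation, and no contemporaneous correlation. Also, eit is uncorrelated with the regressors Xit. That is,

$$E(e_{it}) = 0, \quad E(e_{it}^2) = \sigma_e^2, \quad E(e_{it}e_{js}) = 0 \ \text{for } i \ne j \text{ or } t \ne s, \quad \mathrm{Cov}(X_{it}, e_{it}) = 0$$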

Fixed Effects Model

Assume that the error component ui, the individual difference, is fixed or nonstochastic (but varies across individuals). Thus, the model error is simply εit = eit. The model is expressed as:

Yit = (Xitβ + ui) + eit

where ui is interpreted as the change in the intercept. Therefore the individual effect is defined as ui plus the intercept.

Random Effects Model

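Assume that the error component ui, the individual difference, is random and satisfies the following assumptions:

$$E(u_i) = 0, \quad E(u_i^2) = \sigma_u^2, \quad E(u_iu_j) = 0 \ \text{for } i \ne j, \quad E(u_ie_{jt}) = 0 \ \text{for all } i, j, t$$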

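Then, the model error is εit = ui + eit with the following structure:

$$E(\varepsilon_{it}) = 0, \quad E(\varepsilon_{it}^2) = \sigma_e^2 + \sigma_u^2, \quad E(\varepsilon_{it}\varepsilon_{is}) = \sigma_u^2 \ (t \ne s), \quad E(\varepsilon_{it}\varepsilon_{js}) = 0 \ (i \ne j)$$

In other words, for each cross section i, the variance-covariance matrix of the model error εi = [εi1, εi2, ..., εiT]' is the following T×T matrix:

$$\Sigma = \begin{bmatrix} \sigma_e^2 + \sigma_u^2 & \sigma_u^2 & \cdots & \sigma_u^2 \\ \sigma_u^2 & \sigma_e^2 + \sigma_u^2 & \cdots & \sigma_u^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_u^2 & \sigma_u^2 & \cdots & \sigma_e^2 + \sigma_u^2 \end{bmatrix} = \sigma_e^2 I_T + \sigma_u^2 \mathbf{1}_T$$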

Let ε be an NT-element vector of the stacked errors ε1, ε2, ..., εN, that is, ε = [ε1, ε2, ..., εN]'. Then E(ε) = 0 and E(εε') = I⊗Σ, where I is an N×N identity matrix and Σ is the T×T variance-covariance matrix defined above, itself built from the T×T identity matrix and the T×T matrix of ones.
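The block structure of E(εε') can be checked numerically. The following is a minimal sketch, assuming NumPy and illustrative values for the variance components; the function name `re_error_covariance` is hypothetical and not part of the original notes.

```python
import numpy as np

def re_error_covariance(sigma2_e, sigma2_u, T, N):
    """Build Sigma (T x T) and the stacked covariance I_N kron Sigma (NT x NT).

    Hypothetical helper for illustration; variance components are taken as given.
    """
    # Sigma = sigma2_e * I_T + sigma2_u * (T x T matrix of ones)
    Sigma = sigma2_e * np.eye(T) + sigma2_u * np.ones((T, T))
    # Errors of different individuals are uncorrelated: block-diagonal structure
    V = np.kron(np.eye(N), Sigma)
    return Sigma, V

# Illustrative values only
Sigma, V = re_error_covariance(sigma2_e=1.0, sigma2_u=0.5, T=3, N=2)
print(Sigma)
print(V.shape)
```

Each diagonal block of the stacked matrix is a copy of Σ, and the off-diagonal blocks are zero because errors of different individuals are uncorrelated.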


Model Estimation

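Let Yi = [Yi1, Yi2, ..., YiT]', Xi = [Xi1, Xi2, ..., XiT]', and εi = [εi1, εi2, ..., εiT]'. Then the pooled (stacked) model is

$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{bmatrix} \beta + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_N \end{bmatrix}$$

or, Y = Xβ + ε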

Fixed Effects Model

Consider the model as follows:

Yit = (Xitβ + ui) + eit (i=1,2,...,N; t=1,2,...,T).

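$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{bmatrix} \beta + \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_N \end{bmatrix}$$

or, Y = Xβ + u + e, where each block ui of the stacked vector u is the scalar ui repeated over the T periods of individual i, and ei = [ei1, ei2, ..., eiT]'.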
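The fixed effects model can be estimated either with individual dummy variables or by regressing deviations from individual means. The following is a minimal sketch on simulated data, assuming NumPy; the function names `within_estimator` and `lsdv_estimator` and all data values are hypothetical. The regressors here exclude a constant, since a constant would be collinear with the full set of dummies.

```python
import numpy as np

def within_estimator(Y, X, ids):
    """Slope estimates from OLS on deviations from individual means."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    Yd, Xd = Y.copy(), X.copy()
    for g in np.unique(ids):
        rows = ids == g
        Yd[rows] -= Y[rows].mean()          # remove individual mean of Y
        Xd[rows] -= X[rows].mean(axis=0)    # remove individual means of X
    beta, *_ = np.linalg.lstsq(Xd, Yd, rcond=None)
    return beta

def lsdv_estimator(Y, X, ids):
    """Dummy variable approach: regress Y on X and one dummy per individual."""
    groups = np.unique(ids)
    D = (ids[:, None] == groups[None, :]).astype(float)  # NT x N dummies
    Z = np.hstack([X, D])
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    K = X.shape[1]
    return coef[:K], coef[K:]   # slopes, individual intercepts

# Simulated panel (illustrative values only), sorted by individual
rng = np.random.default_rng(0)
N, T = 5, 8
ids = np.repeat(np.arange(N), T)
X = rng.normal(size=(N * T, 2))
u = rng.normal(size=N)
Y = X @ np.array([1.0, -2.0]) + u[ids] + 0.1 * rng.normal(size=N * T)

b_within = within_estimator(Y, X, ids)
b_lsdv, intercepts = lsdv_estimator(Y, X, ids)
print(b_within, b_lsdv)
```

The two approaches give the same slope estimates; the dummy variable version additionally returns the estimated individual effects.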

Random Effects Model

Recall the pooled model for estimation

Y = Xβ + ε

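where ε = [ε1, ε2, ..., εN]', εi = [εi1, εi2, ..., εiT]', and the random error components are εit = ui + eit. By assumption, E(ε) = 0 and E(εε') = I⊗Σ. The generalized least squares (GLS) estimator of β is

$$\hat{\beta} = [X'(I \otimes \Sigma^{-1})X]^{-1}X'(I \otimes \Sigma^{-1})Y$$

where

$$\Sigma = \begin{bmatrix} \sigma_e^2 + \sigma_u^2 & \sigma_u^2 & \cdots & \sigma_u^2 \\ \sigma_u^2 & \sigma_e^2 + \sigma_u^2 & \cdots & \sigma_u^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_u^2 & \sigma_u^2 & \cdots & \sigma_e^2 + \sigma_u^2 \end{bmatrix} = \sigma_e^2 I_T + \sigma_u^2 \mathbf{1}_T$$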

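Since

$$\Sigma^{-1} = \frac{1}{\sigma_e^2}\left[ I_T - \frac{\sigma_u^2}{\sigma_e^2 + T\sigma_u^2}\mathbf{1}_T \right]$$

can be derived from the estimated variance components $\sigma_e^2$ and $\sigma_u^2$, in practice the model is estimated using the partial deviation approach: a fraction $\theta = 1 - \sqrt{\sigma_e^2/(\sigma_e^2 + T\sigma_u^2)}$ of the individual mean is subtracted from each variable, and OLS is applied to the transformed model

$$Y_{it} - \theta\bar{Y}_i = (X_{it} - \theta\bar{X}_i)\beta + (\varepsilon_{it} - \theta\bar{\varepsilon}_i)$$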
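The equivalence between GLS with the full Kronecker-structured covariance and OLS on partially demeaned data can be verified numerically. The sketch below assumes NumPy, a balanced panel sorted by individual and then time, and known variance components (a feasible version would estimate them first). The function names and data are hypothetical.

```python
import numpy as np

def sigma_inverse(s2e, s2u, T):
    """Closed-form inverse of Sigma = s2e*I + s2u*ones."""
    return (np.eye(T) - s2u / (s2e + T * s2u) * np.ones((T, T))) / s2e

def re_gls_kron(Y, X, N, T, s2e, s2u):
    """GLS using the full NT x NT weight matrix I_N kron Sigma^{-1}."""
    W = np.kron(np.eye(N), sigma_inverse(s2e, s2u, T))
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)

def re_gls_partial_deviation(Y, X, N, T, s2e, s2u):
    """OLS after subtracting theta times the individual means."""
    theta = 1.0 - np.sqrt(s2e / (s2e + T * s2u))
    Ybar = Y.reshape(N, T).mean(axis=1)          # individual means of Y
    Xbar = X.reshape(N, T, -1).mean(axis=1)      # individual means of X
    Ys = Y - theta * np.repeat(Ybar, T)
    Xs = X - theta * np.repeat(Xbar, T, axis=0)
    beta, *_ = np.linalg.lstsq(Xs, Ys, rcond=None)
    return beta

# Simulated balanced panel (illustrative values only)
rng = np.random.default_rng(1)
N, T, s2e, s2u = 4, 5, 1.0, 0.5
X = np.column_stack([np.ones(N * T), rng.normal(size=N * T)])
u = np.sqrt(s2u) * rng.normal(size=N)
Y = X @ np.array([1.0, 2.0]) + np.repeat(u, T) + np.sqrt(s2e) * rng.normal(size=N * T)

b_kron = re_gls_kron(Y, X, N, T, s2e, s2u)
b_pd = re_gls_partial_deviation(Y, X, N, T, s2e, s2u)
print(b_kron, b_pd)
```

The partial deviation version avoids forming any NT×NT matrix, which is why it is preferred in practice. Note that the constant column is transformed too, becoming 1 − θ.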

Hausman's Test for Fixed or Random Effects

Let bfixed be the estimated slope parameters of the fixed effects model (using the dummy variable approach), and brandom the estimated slope parameters of the random effects model. Var(bfixed) and Var(brandom) are the corresponding estimated variance-covariance matrices. Hausman's test for no difference between these two sets of parameters is a Chi-square test whose degrees of freedom equal the number of slope parameters. The test statistic is defined as follows:

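$$H = (b_{fixed} - b_{random})'[\mathrm{Var}(b_{fixed}) - \mathrm{Var}(b_{random})]^{-1}(b_{fixed} - b_{random})$$

The variance difference is taken in this order because the random effects estimator is efficient under the null hypothesis, so Var(bfixed) - Var(brandom) is positive semidefinite and H is nonnegative.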
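As a minimal sketch of the computation, assuming NumPy and taking the variance difference as Var(bfixed) - Var(brandom), which is positive semidefinite under the null hypothesis: the function name `hausman_statistic` and the numbers below are hypothetical, chosen only to illustrate the arithmetic.

```python
import numpy as np

def hausman_statistic(b_fixed, b_random, V_fixed, V_random):
    """Return (H, degrees of freedom) for Hausman's test.

    H is compared against a Chi-square distribution with
    degrees of freedom equal to the number of slope parameters.
    """
    d = np.asarray(b_fixed, dtype=float) - np.asarray(b_random, dtype=float)
    Vd = np.asarray(V_fixed, dtype=float) - np.asarray(V_random, dtype=float)
    # Quadratic form d' Vd^{-1} d, using solve instead of an explicit inverse
    H = float(d @ np.linalg.solve(Vd, d))
    return H, d.size

# Made-up estimates for illustration only
H, df = hausman_statistic(
    b_fixed=[1.0, 2.0],
    b_random=[0.9, 2.2],
    V_fixed=np.diag([0.02, 0.05]),
    V_random=np.diag([0.01, 0.01]),
)
print(H, df)
```

A large H relative to the Chi-square critical value with `df` degrees of freedom is evidence against the random effects specification.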

Extensions

Unbalanced Panel Data

Panels in which the group sizes (time periods) differ across groups (individuals) are not unusual in empirical panel data analysis. These panels are called unbalanced panels. Estimation of the fixed effects and random effects models discussed above must be modified to reflect the structure of unbalanced panels. Modifying the dummy variable or deviation approach for estimating the fixed effects model with unbalanced panel data is straightforward. For the random effects model, however, unequal group sizes introduce the problem of groupwise heteroscedasticity.
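One common way to handle unequal group sizes in the random effects model is to let the partial deviation weight depend on each group's own number of observations Ti, that is, θi = 1 - sqrt(σ²e/(σ²e + Ti σ²u)). The sketch below assumes NumPy and known variance components; the function name and data are hypothetical, and this is an illustration of the idea rather than the procedure of any particular package.

```python
import numpy as np

def partial_deviation_unbalanced(Y, X, ids, s2e, s2u):
    """Partially demean Y and X with a group-specific weight theta_i."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    ids = np.asarray(ids)
    groups = np.unique(ids)
    thetas = np.empty(len(groups))
    Ys, Xs = Y.copy(), X.copy()
    for k, g in enumerate(groups):
        rows = ids == g
        T_i = rows.sum()                       # group size of individual g
        thetas[k] = 1.0 - np.sqrt(s2e / (s2e + T_i * s2u))
        Ys[rows] = Y[rows] - thetas[k] * Y[rows].mean()
        Xs[rows] = X[rows] - thetas[k] * X[rows].mean(axis=0)
    return Ys, Xs, thetas

# Two individuals with 3 and 8 observations (illustrative values only)
ids = np.array([0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1])
rng = np.random.default_rng(2)
X = rng.normal(size=(ids.size, 2))
Y = rng.normal(size=ids.size)

Ys, Xs, thetas = partial_deviation_unbalanced(Y, X, ids, s2e=1.0, s2u=1.0)
print(thetas)
```

Because θi grows with Ti, individuals observed over more periods are demeaned more heavily, which is how the groupwise heteroscedasticity enters the transformation.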

Random Coefficients Model

For each cross section i=1,2,...,N, the model is written as:

Yi = Xiβi + εi
βi = β + υi

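where Yi = [Yi1, Yi2, ..., YiT]', Xi = [Xi1, Xi2, ..., XiT]', and εi = [εi1, εi2, ..., εiT]'. We note that not only the intercept but also the slope parameters are random across individuals. The assumptions of the model are:

$$E(\varepsilon_i) = 0, \quad E(\varepsilon_i\varepsilon_i') = \sigma_i^2 I, \quad E(\varepsilon_i\varepsilon_j') = 0 \ (i \ne j)$$

and

$$E(\upsilon_i) = 0, \quad E(\upsilon_i\upsilon_i') = \Gamma, \quad E(\upsilon_i\upsilon_j') = 0 \ (i \ne j), \quad E(\upsilon_i\varepsilon_j') = 0 \ \text{for all } i, j$$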

The model for estimation is

Yi = Xiβ + (Xiυi + εi), or
Yi = Xiβ + ωi where ωi = Xiυi + εi, and

$$E(\omega_i) = 0, \quad E(\omega_i\omega_i') = \sigma_i^2 I + X_i\Gamma X_i' = \Omega_i$$

The stacked (pooled) model is

Y = Xβ + ω

where ω = [ω1,...,ωN]', and

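$$E(\omega) = 0_{NT \times 1}$$

$$\mathrm{Var}(\omega) = E(\omega\omega') = V = \begin{bmatrix} \Omega_1 & 0 & \cdots & 0 \\ 0 & \Omega_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Omega_N \end{bmatrix}$$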

GLS is used to estimate the model. That is,

$$b^* = (X'V^{-1}X)^{-1}X'V^{-1}Y$$

$$\mathrm{Var}(b^*) = (X'V^{-1}X)^{-1}$$

The computation is based on the following steps (Swamy, 1971):

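  1. For each regression equation i, Yi = Xiβi + εi, obtain the OLS estimator of βi:
    $b_i = (X_i'X_i)^{-1}X_i'Y_i$
    $\mathrm{Var}(b_i) = (X_i'X_i)^{-1}(X_i'\Omega_iX_i)(X_i'X_i)^{-1} = \sigma_i^2(X_i'X_i)^{-1} + \Gamma = V_i + \Gamma$
    (taking account of heteroscedasticity, where $V_i = \sigma_i^2(X_i'X_i)^{-1}$).
    Note that $\sigma_i^2$ is estimated by $s_i^2 = e_i'e_i/(T-K)$, where $e_i = Y_i - X_ib_i$ and K is the number of regressors.
    Then $V_i$ is estimated by $s_i^2(X_i'X_i)^{-1}$.

  2. For the random coefficients equation, βi = β + υi, the variance of bi (estimator of βi) is estimated by:
    $\sum_{i=1}^{N}(b_i-b_m)(b_i-b_m)'/(N-1) = \left(\sum_{i=1}^{N}b_ib_i' - N b_mb_m'\right)/(N-1)$, where $b_m = \sum_{i=1}^{N}b_i/N$.
    Therefore, $\Gamma = \left(\sum_{i=1}^{N}b_ib_i' - N b_mb_m'\right)/(N-1) - \sum_{i=1}^{N}V_i/N$.
    Because this estimate of Γ may fail to be positive definite, we use
    $\Gamma = \left(\sum_{i=1}^{N}b_ib_i' - N b_mb_m'\right)/(N-1)$.

  3. Write the GLS estimator of β as:
    $b^* = (X'V^{-1}X)^{-1}X'V^{-1}Y$
    $= \left[\sum_{i=1}^{N}X_i'\Omega_i^{-1}X_i\right]^{-1}\left[\sum_{i=1}^{N}X_i'\Omega_i^{-1}Y_i\right]$
    $= \left[\sum_{i=1}^{N}X_i'\Omega_i^{-1}X_i\right]^{-1}\left[\sum_{i=1}^{N}X_i'\Omega_i^{-1}X_ib_i\right]$
    $= \left[\sum_{i=1}^{N}(\Gamma+V_i)^{-1}\right]^{-1}\left[\sum_{i=1}^{N}(\Gamma+V_i)^{-1}b_i\right]$
    $= \sum_{i=1}^{N}W_ib_i$, where $W_i = \left[\sum_{j=1}^{N}(\Gamma+V_j)^{-1}\right]^{-1}(\Gamma+V_i)^{-1}$.
    Similarly,
    $\mathrm{Var}(b^*) = (X'V^{-1}X)^{-1} = \left[\sum_{i=1}^{N}(\Gamma+V_i)^{-1}\right]^{-1}$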

The individual parameter vectors may be predicted as follows:

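$$b_i^* = (\Gamma^{-1} + V_i^{-1})^{-1}(\Gamma^{-1}b^* + V_i^{-1}b_i) = A_ib^* + (I - A_i)b_i,$$

where $A_i = (\Gamma^{-1} + V_i^{-1})^{-1}\Gamma^{-1}$.

$$\mathrm{Var}(b_i^*) = \begin{bmatrix} A_i & I - A_i \end{bmatrix} \begin{bmatrix} \sum_{j} W_j(\Gamma + V_j)W_j' & W_i(\Gamma + V_i) \\ (\Gamma + V_i)W_i' & \Gamma + V_i \end{bmatrix} \begin{bmatrix} A_i' \\ (I - A_i)' \end{bmatrix}$$

where the sum runs over all cross sections j.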
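The computation steps and the prediction of individual parameter vectors can be sketched as follows, assuming NumPy and simulated data. The function name `swamy_estimator` and the returned dictionary keys are hypothetical. One assumption differs slightly from the text: the sketch falls back to the simpler form of Γ only when the full form is not positive definite.

```python
import numpy as np

def swamy_estimator(Y_list, X_list):
    """Random coefficients GLS from per-individual OLS results."""
    N = len(Y_list)
    K = X_list[0].shape[1]
    b, V = [], []
    for Yi, Xi in zip(Y_list, X_list):
        T = Xi.shape[0]
        XtX_inv = np.linalg.inv(Xi.T @ Xi)
        bi = XtX_inv @ Xi.T @ Yi                 # step 1: OLS per individual
        ei = Yi - Xi @ bi
        b.append(bi)
        V.append((ei @ ei / (T - K)) * XtX_inv)  # V_i = s_i^2 (X_i'X_i)^{-1}
    b = np.array(b)                              # N x K
    bm = b.mean(axis=0)
    S_b = (b - bm).T @ (b - bm) / (N - 1)        # step 2: dispersion of b_i
    Gamma = S_b - sum(V) / N
    if np.linalg.eigvalsh(Gamma).min() <= 0:     # assumption: conditional fallback
        Gamma = S_b
    P = [np.linalg.inv(Gamma + Vi) for Vi in V]
    var_b_star = np.linalg.inv(sum(P))           # step 3: Var(b*)
    W = [var_b_star @ Pi for Pi in P]
    b_star = sum(Wi @ bi for Wi, bi in zip(W, b))
    # Predicted individual parameters: A_i b* + (I - A_i) b_i
    G_inv = np.linalg.inv(Gamma)
    A = [np.linalg.inv(G_inv + np.linalg.inv(Vi)) @ G_inv for Vi in V]
    b_pred = np.array([Ai @ b_star + (np.eye(K) - Ai) @ bi
                       for Ai, bi in zip(A, b)])
    return {"b_star": b_star, "var_b_star": var_b_star, "Gamma": Gamma,
            "W": W, "V": V, "A": A, "b_pred": b_pred}

# Simulated data (illustrative values only): 8 individuals, 20 periods each
rng = np.random.default_rng(3)
Y_list, X_list = [], []
for _ in range(8):
    Xi = np.column_stack([np.ones(20), rng.normal(size=20)])
    beta_i = np.array([1.0, 2.0]) + 0.3 * rng.normal(size=2)
    Y_list.append(Xi @ beta_i + 0.5 * rng.normal(size=20))
    X_list.append(Xi)

res = swamy_estimator(Y_list, X_list)
print(res["b_star"])
```

The weights Wi sum to the identity matrix, so b* is a matrix-weighted average of the individual OLS estimates, and each predicted bi* shrinks the individual estimate bi toward b*.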

Seemingly Unrelated System Model

Consider a more general specification of the model:

Yit = Xitβi + εit (i=1,2,...,N; t=1,2,...,T).

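Let Yi = [Yi1, Yi2, ..., YiT]', Xi = [Xi1, Xi2, ..., XiT]', and εi = [εi1, εi2, ..., εiT]'. The stacked system of N equations (T observations each) is Y = Xβ + ε, or

$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{bmatrix} = \begin{bmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_N \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_N \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_N \end{bmatrix}$$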

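Notice that not only the intercept but also the slope parameters differ across individuals. The error structure of the model is summarized as follows:

$$E(\varepsilon) = 0, \quad E(\varepsilon\varepsilon') = \Sigma \otimes I, \quad \text{that is,} \quad E(\varepsilon_i\varepsilon_j') = \sigma_{ij}I$$

where Σ = [σij] is the N×N matrix of contemporaneous covariances across equations and I is a T×T identity matrix.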

Parameter restrictions can be built into the matrix X and the corresponding parameter vector β. The model is estimated using techniques for systems of regression equations.

The system estimation techniques such as 3SLS and FIML should be used for parameter estimation. It is called the Seemingly Unrelated Regression Estimation (SURE) in the current context. Denote b and S as the estimated β and Σ, respectively. Then,

$$b = [X'(S^{-1} \otimes I)X]^{-1}X'(S^{-1} \otimes I)Y$$

$$\mathrm{Var}(b) = [X'(S^{-1} \otimes I)X]^{-1}$$

and S = [sij] with sij = ei'ej/T, where ei = Yi - Xibi is the estimated error εi of equation i.
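A one-step feasible version of this estimator can be sketched as below, assuming NumPy and simulated data. Here S is computed from equation-by-equation OLS residuals and then held fixed; iterating between b and S is a possible refinement that the sketch does not attempt. The function name `sur_fgls` is hypothetical.

```python
import numpy as np

def sur_fgls(Y_list, X_list):
    """One-step SUR estimate: S from OLS residuals, then GLS with S^{-1} kron I."""
    N = len(Y_list)
    T = len(Y_list[0])
    # Step 1: OLS residuals per equation, collected as a T x N matrix
    E = np.column_stack([
        Yi - Xi @ np.linalg.lstsq(Xi, Yi, rcond=None)[0]
        for Yi, Xi in zip(Y_list, X_list)
    ])
    S = E.T @ E / T                      # s_ij = e_i'e_j / T
    # Step 2: block-diagonal regressor matrix and stacked Y
    cols = [Xi.shape[1] for Xi in X_list]
    X = np.zeros((N * T, sum(cols)))
    c = 0
    for i, Xi in enumerate(X_list):
        X[i * T:(i + 1) * T, c:c + cols[i]] = Xi
        c += cols[i]
    Y = np.concatenate(Y_list)
    # Step 3: GLS with weight matrix S^{-1} kron I_T
    W = np.kron(np.linalg.inv(S), np.eye(T))
    A = X.T @ W @ X
    b = np.linalg.solve(A, X.T @ W @ Y)
    return b, np.linalg.inv(A), S

# Two equations with correlated errors (illustrative values only)
rng = np.random.default_rng(4)
T = 40
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])
X2 = np.column_stack([np.ones(T), rng.normal(size=T)])
common = rng.normal(size=T)              # shared shock across equations
Y1 = X1 @ np.array([1.0, 2.0]) + common + 0.5 * rng.normal(size=T)
Y2 = X2 @ np.array([-1.0, 0.5]) + common + 0.5 * rng.normal(size=T)

b, var_b, S = sur_fgls([Y1, Y2], [X1, X2])
print(b)
```

When every equation has the same regressors, this estimator reduces to equation-by-equation OLS, which gives a convenient check on the implementation.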


Copyright © Kuan-Pin Lin
Last updated: February 12, 2012