Re: st: Panel ID and Time Variable

From: Ralph.Heinrich@unece.org
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Panel ID and Time Variable
Date: Tue, 12 Apr 2005 16:36:44 +0200

```

... Just setting the panel ID to nations without setting a time ID
(presumably with - iis nations - ?) will not do anything imho (try writing
. iis, clear
and re-estimate your regression; the result should be unchanged; it should
also be unchanged if you use - iis individuals - or even - iis waves -).
What you get is a "pooled" regression of 40,000 observations with no
distinction between nations, individuals or waves. This means for instance
that you cannot use lags or first differences because no time series
dimension is defined (try it). More importantly, it means that in your
estimation results, the information contained in all observations will
receive the same "weight". By contrast, the advantage of panel regressions
is that you can give different "weights" to information coming e.g. from
one additional observation on a given individual and to information coming
from observing an additional individual (if you are interested in knowing
what color the cows in Scotland typically are, you learn little by
observing a given cow one more time, but you learn a lot by observing one
more cow for the first time :-). In that sense, a pooled model is not using
the information in the data optimally (unless for your purpose equal
weighting is appropriate on a priori grounds).

To run a true panel regression, every observation has to be uniquely
identified, i.e. it has to correspond to one "unit" at one point in time.
It is conceptually not possible to run a panel regression where a variable
is observed more than once for a given "unit" at a given point in time.
Therefore your panel ID has to be the individual (if you absolutely want it
to be nations rather than individuals, then the only way to do it is to
discard the individual-level information, most likely by averaging over
individuals for every nation at every point in time, as you suggested
earlier).
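
A rough sketch of that averaging route, assuming nations and waves are
numeric variables and using y and x as placeholder names for your
dependent and explanatory variables (they are not taken from your data):
. preserve
. collapse (mean) y x, by(nations waves)
. tsset nations waves
. xtreg y x, fe
. restore
After the collapse each nation-wave pair occurs exactly once, so - tsset -
accepts it; the price is that all variation across individuals within a
nation and wave is gone.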

If your panel is just unbalanced (i.e. mostly the same individuals across
time, with some dropping out and some joining, and the dropping out and
joining in can be considered random), use - tsset individuals waves - and
include country dummies in your regressions; Stata handles unbalanced
panels automatically (gaps can still cause problems, e.g. for individuals
who are in the first and last waves but not in the second, since lags and
first differences across the gap are missing). If the sets of
individuals in the different waves are (largely) disjoint, however, an
alternative may be to just estimate separate cross-sections for all waves
(again with individuals as units of observation and with country dummies).
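
In commands, the two options might look roughly like this. Again y and x
are placeholder variable names, and I am assuming that the individual ID
is numeric and unique across nations and that nobody changes country. The
first option uses random effects because country dummies that are constant
within an individual would be dropped from a fixed effects regression:
. tsset individuals waves
. xi: xtreg y x i.nations, re
. xi: regress y x i.nations if waves == 1
where the last line is the cross-section alternative and would be repeated
for each wave.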

Best
R.

Dr. Ralph Heinrich
Economic Affairs Officer
Economic Analysis Division
UN Economic Commission for Europe

room 441
Palais des Nations
1211 Geneva 10
phone: 0041 22 917 1269