# Re: st: Cross-Sectional Time Series

From: Mark Schaffer
To: statalist@hsphsun2.harvard.edu, anirban basu
Subject: Re: st: Cross-Sectional Time Series
Date: Wed, 26 Jun 2002 09:56:52 +0100 (BST)

```
Anirban,

Quoting anirban basu <abasu@midway.uchicago.edu>:

>
>
> On Tue, 25 Jun 2002, Mark Schaffer wrote:
>
> > Not quite sure what you mean, so apologies if I'm off target.  The
> > coefficient estimates with -regress- won't be the same as with
> > -xtreg, fe- (unless the former is estimating the same model by
> > explicitly including the fixed effects as dummy vars).  Both sets of
> > coefficients will be different again from those produced by
> > -xtreg, re-.
> >
> > --Mark
>
> from -regress-, xtreg, fe and xtreg, re using a simulated dataset with
> exchangeable corr and no dummy vars.

I reproduced your example.  The coefficients are virtually, but not
exactly, the same - they differ after about 12 decimal places.
I think what's happening is that because of the way you set up your
artificial dataset, all 3 estimators are giving you unbiased and
consistent estimates of an assumed linear relationship, and all 3 get
very close.  (The true relationship isn't linear but that doesn't
matter here.)

--Mark

> Also, I got different coeffs by running an exponential corr model, as
> expected. Here are the details. I may have missed something here.
> Thanks for your input,
>
> Anirban
>
>
> . mat C = (1, 0.6, 0.6, 0.6 \ 0.6, 1, 0.6, 0.6 \ 0.6, 0.6, 1, 0.6 \ 0.6, 0.6, 0.6, 1)
>
> .
> . drawnorm y1 y2 y3 y4, n(1000) means(1 3 4 7) corr(C)
> (obs 1000)
>
> . gen id=_n
>
> . reshape long y , i(id) j(time)
> (note: j = 1 2 3 4)
>
> Data                               wide   ->   long
> -----------------------------------------------------------------------------
> Number of obs.                     1000   ->    4000
> Number of variables                   5   ->       3
> j variable (4 values)                     ->   time
> xij variables:
>                            y1 y2 ... y4   ->   y
> -----------------------------------------------------------------------------
>
> .
> . reg y time, cluster(id)
>
> Regression with robust standard errors                 Number of obs =    4000
>                                                        F(  1,   999) =42518.72
>                                                        Prob > F      =  0.0000
>                                                        R-squared     =  0.7897
> Number of clusters (id) = 1000                         Root MSE      =  1.0946
>
> ------------------------------------------------------------------------------
>              |               Robust
>            y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>         time |   1.896646   .0091981   206.20   0.000     1.878596    1.914696
>        _cons |  -.9866896   .0358667   -27.51   0.000    -1.057072    -.916307
> ------------------------------------------------------------------------------
>
> .
> . tsset id time
>        panel variable:  id, 1 to 1000
>         time variable:  time, 1 to 4
> . iis id
> . tis time
>
> . xtreg y time, fe
>
> Fixed-effects (within) regression               Number of obs      =      4000
> Group variable (i) : id                         Number of groups   =      1000
>
> R-sq:  within  = 0.9027                         Obs per group: min =         4
>        between = 0.0000                                        avg =       4.0
>        overall = 0.7897                                        max =         4
>
>                                                 F(1,2999)          =  27835.94
> corr(u_i, Xb)  = -0.0000                        Prob > F           =    0.0000
>
> ------------------------------------------------------------------------------
>            y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>         time |   1.896646    .011368   166.84   0.000     1.874356    1.918936
>        _cons |  -.9866896   .0311325   -31.69   0.000    -1.047733   -.9256464
> -------------+----------------------------------------------------------------
>      sigma_u |  .84490704
>      sigma_e |  .80383774
>          rho |  .52489398   (fraction of variance due to u_i)
> ------------------------------------------------------------------------------
> F test that all u_i=0:     F(999, 2999) =     4.42             Prob > F = 0.0000
>
> . xtreg y time, re
>
> Random-effects GLS regression                   Number of obs      =      4000
> Group variable (i) : id                         Number of groups   =      1000
>
> R-sq:  within  = 0.9027                         Obs per group: min =         4
>        between = 0.0000                                        avg =       4.0
>        overall = 0.7897                                        max =         4
>
> Random effects u_i ~ Gaussian                   Wald chi2(1)       =  27835.94
> corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000
>
> ------------------------------------------------------------------------------
>            y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>         time |   1.896646    .011368   166.84   0.000     1.874365    1.918927
>        _cons |  -.9866896   .0390072   -25.30   0.000    -1.063142   -.9102369
> -------------+----------------------------------------------------------------
>      sigma_u |  .74318848
>      sigma_e |  .80383774
>          rho |  .46085639   (fraction of variance due to u_i)
> ------------------------------------------------------------------------------
>
> . prais y time
>
> Number of gaps in sample:  999   (gap count includes panel changes)
> (note: computations for rho restarted at each gap)
>
> Iteration 0:  rho = 0.0000
> Iteration 1:  rho = 0.4034
> Iteration 2:  rho = 0.4136
> Iteration 3:  rho = 0.4140
> Iteration 4:  rho = 0.4140
> Iteration 5:  rho = 0.4140
>
> Prais-Winsten AR(1) regression -- iterated estimates
>
>       Source |       SS       df       MS              Number of obs =    4000
> -------------+------------------------------           F(  1,  3998) = 8914.61
>        Model |  8913.21448     1  8913.21448           Prob > F      =  0.0000
>     Residual |  3997.37575  3998  .999843859           R-squared     =  0.6904
> -------------+------------------------------           Adj R-squared =  0.6903
>        Total |  12910.5902  3999  3.22845467           Root MSE      =  .99992
>
> ------------------------------------------------------------------------------
>            y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>         time |   1.953367   .0157109   124.33   0.000     1.922565    1.984169
>        _cons |  -1.061051   .0456138   -23.26   0.000     -1.15048   -.9716229
> -------------+----------------------------------------------------------------
>          rho |   .4140074
> ------------------------------------------------------------------------------
> Durbin-Watson statistic (original)    0.925711
> Durbin-Watson statistic (transformed) 1.672284
>
>
>
>
>
> >
> > >
> > >
> > > Anirban
> > >
> > > ______________________________________
> > > ANIRBAN BASU
> > > Doctoral Student
> > > Harris School of Public Policy Studies
> > > University of Chicago
> > > (312) 563 0907 (H)
> > >
> > > ________________________________________________________________
> > >
> > >
> > > On Tue, 25 Jun 2002, Mark Schaffer wrote:
> > >
> > > > Hi everybody.
> > > >
> > > > Just a couple of clarifying details on -cluster- vs. -xtreg- and
> > > > Anirban's response to John.
> > > >
> > > > The -cluster- option for -regress- doesn't really impose a particular
> > > > within-cluster correlation structure on the data.  If I understand it
> > > > correctly, what -cluster- does instead is loosen the usual assumption
> > > > of independence of observations to independence of clusters.  The
> > > > correlation between observations within clusters can be arbitrary.
> > > > The way this works is basically by treating all the observations in a
> > > > cluster as a kind of "super-observation" and then applying the robust
> > > > ("sandwich") formula to these super-observations in order to
> > > > calculate the standard errors of the coefficients produced by
> > > > -regress-.  See the manual entry for -regress-, p. 87.
> > > >
> > > > The estimated coefficients (the betas) produced by -regress- are the
> > > > same whether or not the -cluster- option is used; the only thing that
> > > > is different is the standard errors.
> > > >
> > > > With fixed effects, you _do_ impose a particular correlation
> > > > structure, namely all the observations within a cluster share U(k) in
> > > > Anirban's notation.  If you use -xtreg- with -fe- to estimate, Stata
> > > > does not, however, use a first-difference estimator - it uses a fixed
> > > > effects estimator.  In other words, it doesn't first-difference to
> > > > get rid of the fixed effects, it uses the mean-deviation
> > > > transformation to get rid of them.
> > > >
> > > > Hope this helps.
> > > >
> > > > --Mark
> > > >
> > > > Quoting anirban basu <abasu@midway.uchicago.edu>:
> > > >
> > > > > Hi John,
> > > > >
> > > > > With reg command and cluster option, one basically imposes an
> > > > > exchangeable correlation structure on the data, i.e. assume
> > > > > corr(y(i), y(j)) = rho, where i ne j and i, j are any two
> > > > > observations from the same cluster. Rho is constant for every pair
> > > > > of observations within a cluster. So, one can visualize it in terms
> > > > > of a random effects model where:
> > > > >
> > > > > Y(k) = Xb + U(k) + e, where k represents clusters and U(k) is a
> > > > > cluster-specific random effect that is common to all observations
> > > > > in that cluster. However, -reg- does not give estimates of this
> > > > > random effect. It just estimates -betas- assuming this structure.
> > > > >
> > > > > However, this estimation is correct only if U(k) are uncorrelated
> > > > > with Xs, i.e. the unobserved characteristics of a cluster over time
> > > > > are uncorrelated with the X over time. If not, then fixed effects
> > > > > is useful.
> > > > >
> > > > > With fixed effects, one evades the correlation problem by taking
> > > > > differences, i.e. for any cluster k:
> > > > >
> > > > > Y(ik) - Y(1k) = [X(ik) - X(1k)]b + [e(ik) - e(1k)]
> > > > >
> > > > > Note that by taking the difference, the unobserved U(k) is
> > > > > eliminated. However, fixed effects assume the U(k) is fixed over
> > > > > time for any cluster k, i.e. the unobserved characteristics of a
> > > > > cluster are not changing over time. Also, since we are taking a
> > > > > difference, fixed effects model cannot estimate the betas for
> > > > > baseline covariates since they cancel out in the difference.
> > > > >
> > > > > Hope this helps,
> > > > >
> > > > > Anirban
> > > > >
> > > > > On Tue, 25 Jun 2002, John Neumann wrote:
> > > > >
> > > > > > Hello all,
> > > > > >
> > > > > > Since I frequently see panel data questions flying around the
> > > > > > list, I'm thinking that some of you can provide me with a
> > > > > > very succinct answer to the following question, and in so
> > > > > > doing clarify conceptually for me the data-related issue:
> > > > > >
> > > > > > I have data on investment products, by year.  Not all
> > > > > > products have data in each year.  The dependent
> > > > > > variable is scaled in such a way as to make time series
> > > > > > variation in its levels of no concern.  Here's the question:
> > > > > >
> > > > > > What is the difference between using the reg command,
> > > > > > with the robust and cluster option, vs. the xtreg command
> > > > > > fixed effects model?  The cluster variable using reg would
> > > > > > naturally be the i( ) parameter for xtreg ...
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > John Neumann
> > > > > > Boston University

Prof. Mark Schaffer
Director, CERT
Department of Economics, School of Management
Heriot-Watt University, Edinburgh EH14 4AS
tel +44-131-451-3494 / fax +44-131-451-3008
email: m.e.schaffer@hw.ac.uk
web: http://www.som.hw.ac.uk/ecomes
```
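
The point discussed in the thread, that pooled -regress- and -xtreg, fe- return the same slope on this simulated panel, can be checked numerically. What follows is a minimal sketch in Python (standard library only) rather than Stata, and it is not the posters' code. The function names, the seed, and the "shared plus idiosyncratic normal draw" construction of the exchangeable correlation are all illustrative assumptions. One way to see the result: `time` takes the same values 1 to 4 in every panel, so subtracting panel means from `time` is the same as subtracting its overall mean, and the within (mean-deviation) slope coincides with the pooled OLS slope up to floating-point error. A no-constant first-difference slope is included only as a contrast with the mean-deviation transformation.

```python
import random


def simulate_panel(n_ids=1000, means=(1.0, 3.0, 4.0, 7.0), rho=0.6, seed=12345):
    """Balanced panel of (id, time, y): unit variance, exchangeable corr rho.

    Assumption for illustration: y = mean_t + shared_i + noise_it.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_ids):
        shared = rng.gauss(0.0, rho ** 0.5)  # component common to one id
        for t, mean in enumerate(means, start=1):
            noise = rng.gauss(0.0, (1.0 - rho) ** 0.5)  # idiosyncratic part
            rows.append((i, t, mean + shared + noise))
    return rows


def ols_slope(xs, ys):
    """Slope from a simple regression of ys on xs with a constant."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def group_by_id(rows):
    """Collect (time, y) pairs per id."""
    groups = {}
    for i, t, y in rows:
        groups.setdefault(i, []).append((t, y))
    return groups


def pooled_slope(rows):
    """Pooled OLS of y on time, ignoring the panel structure."""
    return ols_slope([t for _, t, _ in rows], [y for _, _, y in rows])


def within_slope(rows):
    """Fixed-effects slope via the mean-deviation (within) transformation."""
    xs, ys = [], []
    for obs in group_by_id(rows).values():
        t_bar = sum(t for t, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for t, y in obs:
            xs.append(t - t_bar)
            ys.append(y - y_bar)
    return ols_slope(xs, ys)


def first_difference_slope(rows):
    """Slope from regressing first differences of y on those of time, no constant."""
    num = den = 0.0
    for obs in group_by_id(rows).values():
        obs.sort()  # order each panel by time before differencing
        for (t0, y0), (t1, y1) in zip(obs, obs[1:]):
            num += (t1 - t0) * (y1 - y0)
            den += (t1 - t0) ** 2
    return num / den


if __name__ == "__main__":
    panel = simulate_panel()
    print("pooled OLS      :", pooled_slope(panel))
    print("within (FE)     :", within_slope(panel))
    print("first difference:", first_difference_slope(panel))
```

With means 1, 3, 4, 7 at times 1 to 4, the best linear fit has slope 1.9 and intercept -1, in line with the 1.896646 and -.9866896 reported in the logs above. The no-constant first-difference slope instead targets the average one-period change, (7 - 1)/3 = 2, so it need not match the within estimate when the underlying relationship is not linear.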