Statalist



Re: st: determining standard errors of fixed effect terms after xtreg


From   "Sarah Cohodes" <[email protected]>
To   [email protected]
Subject   Re: st: determining standard errors of fixed effect terms after xtreg
Date   Mon, 11 Feb 2008 18:29:49 -0500

Hi Austin, and all,

Thanks so much for your informative response, and apologies for only
getting back to the list now.

I implemented your first solution on the sample data, and
unsurprisingly, it worked perfectly. Interestingly, when implemented
in my "real" data, the FE estimated by -predict- is not the same as
the FE estimated by taking the demeaned betas from the dummy variables
(it is the same in bpwide.dta).  I find this slightly disturbing.

In this particular case, I have 600 groups (M) which vary in size from
a couple dozen to over 1,000.

As for Rothstein, in case anyone else is interested, I'm not sure if
my situation is analogous, as I'm looking at school fixed effects, not
teacher fixed effects, but am definitely considering his critiques.

Thanks again to Austin.

Sarah Cohodes

---------- Forwarded message ----------
From: Austin Nichols <[email protected]>
Date: Feb 1, 2008 12:32 PM
Subject: Re: st: determining standard errors of fixed effect terms after xtreg
To: [email protected]


Sarah--
I haven't seen an answer on the list (let us know if you get any
answers offlist), so I'll venture one. It may not be the one you were
looking for.  In short, if you want the standard errors of the fixed
effects terms, put the fixed effects in as dummy variables.

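sysuse bpwide, clear
* FE model; -predict, u- returns the estimated fixed effects
qui xtreg bp_after bp_before sex, fe i(agegrp) robust
predict fe_agegrp, u
* same model with one dummy per group and no constant
qui tab agegrp, gen(d)
qui reg bp_after bp_before sex d*, nocons robust
g fe=.
g se=.
forv i=1/3 {
 replace fe=_b[d`i'] if agegrp==`i'
 replace se=_se[d`i'] if agegrp==`i'
 }
* demean the dummy coefficients to compare with -predict, u-
su fe, meanonly
g fe_demeaned=fe-r(mean)
tab fe_*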
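
A variant of the same idea, for anyone on a newer Stata: the sketch below
assumes Stata 11 or later (factor-variable notation), which is not part of
the example above. The names fe2 and se2 are only illustrative. It loops
over the levels of agegrp rather than hard-coding 1/3, so it should carry
over to a grouping variable with many levels.

```stata
* Sketch only: assumes Stata 11+ factor variables; fe2/se2 are illustrative names
sysuse bpwide, clear
* ibn. keeps every age-group level, so with nocons each coefficient
* is that group's own intercept (same model as the d* dummies)
qui reg bp_after bp_before sex ibn.agegrp, nocons vce(robust)
g fe2=.
g se2=.
levelsof agegrp, loc(levs)
foreach v of local levs {
 * _b[#.var] and _se[#.var] pick out the coefficient on level # of var
 qui replace fe2=_b[`v'.agegrp] if agegrp==`v'
 qui replace se2=_se[`v'.agegrp] if agegrp==`v'
 }
* one row per group: estimated effect and its standard error
tabstat fe2 se2, by(agegrp)
```

As before, fe2 would still need to be demeaned before comparing it with the
output of -predict, u-.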

If there are too many dummy variables in such a model to get
estimates, the standard errors are probably meaningless anyway.  Note
that if you have panel data with N individuals in M groups over T time
periods and you want to estimate M fixed effects at the group level,
but you worry about serial correlation or other intra-group
correlation and so cluster at the group level, the assumption is that
M is close to infinity (or close enough) and there is no way to get
good estimates of the sampling variation of M coefficients on dummy
variables.

Without clustering, if N=a*M, for some constant a, or nearly, e.g.
roughly 25 kids per class, the asymptotic justification for standard
errors means that as your data is getting infinitely large (or close
enough) you are estimating infinitely many parameters (or close enough
to infinity to be computationally infeasible) and the sampling
variation of those parameter estimates is not going to be consistently
estimated (since N increases at the same rate as M).  However, if 25
is close enough to infinity for our purposes, we can get reasonable
estimates, I suppose.  I.e. we need the value of a to be "close" to
infinity, regardless of N or M.
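
To make that last point concrete, here is a rough sketch under simplifying
assumptions not stated above: no other X vars, i.i.d. errors with variance
sigma^2, and m_i observations in group i. In that case each estimated fixed
effect is just a group mean, and

```latex
\operatorname{Var}(\hat{\alpha}_i) = \frac{\sigma^2}{m_i},
\qquad\text{so with } m_i \approx a:\quad
\operatorname{se}(\hat{\alpha}_i) \approx \frac{\sigma}{\sqrt{a}} .
```

With a=25 the standard error stays near sigma/5 however large N and M get,
which is why only a matters here.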

That said, if you had no other X vars (only fixed effects), you could
calculate the VCE by hand.  This is easiest when you assume i.i.d.
errors. The inverse of X'X when X is a bunch of dummies is just an MxM
diagonal matrix with 1/m_i on the diagonal, where m_i is the number
of cases in group i.  Since the VCE is the mean squared residual
times n/(n-k) times the inverse of X'X, you can get the standard errors
like so (the direct comparison is possible with 3 fixed effects, but
not with many thousands, of course):

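sysuse bpwide, clear
* FE model with no other X vars
qui xtreg bp_after, fe i(agegrp)
loc n=e(N)
loc k=e(df_m)+1
* mean squared residual
predict double e if e(sample), e
g double e2=e^2
su e2, meanonly
loc e2=r(mean)
* drop any matrix v left over from an earlier run, since nullmat(v)
* would otherwise append to it
cap mat drop v
* row vector of 1/m_i, one entry per group
levelsof agegrp, loc(levs)
foreach v of local levs {
 qui count if e<. & agegrp==`v'
 mat v=nullmat(v),1/r(N)
 }
* VCE = mean squared residual * n/(n-k) * inverse of X'X
mat v=diag(v')
mat v=`e2'*`n'/(`n'-`k')*v
mat li v
* compare with the VCE from the dummy-variable regression
qui tab agegrp, gen(d)
qui reg bp_after d*, nocons
mat li e(V)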
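
Written out, the algebra behind the hand calculation looks roughly like
this. It is a sketch assuming i.i.d. errors and no X vars other than the M
group dummies (so k=M); \hat{e}_j for the residuals and \hat{\alpha}_i for
the coefficient on dummy i are notation introduced here for illustration:

```latex
X'X = \operatorname{diag}(m_1,\dots,m_M)
\quad\Longrightarrow\quad
(X'X)^{-1} = \operatorname{diag}\!\left(\tfrac{1}{m_1},\dots,\tfrac{1}{m_M}\right)

\widehat{V} = \frac{n}{n-k}\left(\frac{1}{n}\sum_{j=1}^{n}\hat{e}_j^{\,2}\right)(X'X)^{-1},
\qquad
\operatorname{se}(\hat{\alpha}_i) = \sqrt{\frac{\sum_{j=1}^{n}\hat{e}_j^{\,2}}{(n-k)\,m_i}}
```

So the standard error of each fixed effect shrinks only with the size m_i
of its own group.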

If the other X vars were uncorrelated with the fixed effects, you
could extend this approach to include other X vars, I think, but that
is unlikely to be the case in practice.

A related note: You've seen Rothstein (2007), right?  I mean
http://www.princeton.edu/~jrothst/workingpapers/rothstein_VAM_20071120.pdf
(referenced without a link in
http://www-personal.umich.edu/~nicholsa/ciwod.pdf) which impugns these
FE models for teacher effects on test score data.  Rothstein finds
that 5th grade teachers have similar size "effects" on 4th grade
scores as 4th grade teachers have (i.e. the SD across teachers of
estimated fixed effects is of comparable magnitude for both the
teachers who can have a causal effect and those who can't), which
means either (1) students are tracked based on scores to different
kinds of teachers/classes, which means we can't interpret the
"effects" as causal, or (2) the measured "effects" are just noise, for
both 4th and 5th grade teachers.  Or some convex combination of these
explanations. I think.  You might want to read the paper yourself.

Am I interpreting your example right, that you are specifying a FE
model with a lagged dependent variable on the right hand side?  Do you
want -xtabond- by chance?  Then you'd have even more trouble
recovering FE and SEs, I guess...


On Jan 29, 2008 12:49 PM, Sarah Cohodes <[email protected]> wrote:
> Dear Statalisters,
>
> I am trying to obtain the standard errors of the fixed effects terms
> after -xtreg-.
>
> Using an analogous dataset to mine, obtaining the fe terms is easy:
>
> sysuse bpwide, clear
> xtreg bp_after bp_before sex, fe i(agegrp) robust
> predict fe_agegrp, u
>
> However, I don't know how to proceed to obtain the standard errors of
> these fixed effects.
>
> I appreciate your time and effort in advance.  I noticed that this
> question has been asked but not answered on Statalist previously (see:
> http://www.stata.com/statalist/archive/2007-01/msg01046.html or:
> http://www.stata.com/statalist/archive/2006-04/msg00026.html).
> I am loath to revisit a topic that was not previously of interest or
> perhaps has a trivial answer that I've overlooked; however, I am quite
> stuck.
>
> Again, thanks,
> Sarah
>
> ********************************
> Sarah Cohodes
> Data Manager and Research Analyst
> Project for Policy Innovation in Education
> Harvard Graduate School of Education
> 617.496.3408 (phone)
> 617.495.2614 (fax)


