Statalist The Stata Listserver


RE: st: fixed effects with clustering when the number of levels of variable to be absorbed exceeds number of clusters


From: "Schaffer, Mark E" <[email protected]>
To: <[email protected]>
Subject: RE: st: fixed effects with clustering when the number of levels of variable to be absorbed exceeds number of clusters
Date: Wed, 1 Mar 2006 16:49:28 -0000

Daniel,

This is a tricky question, at least for me, and I don't know the
complete answer.

The situation you describe is definitely a problem if you want to test
lots of parameter restrictions.  If you try, say, to test the joint
significance of all your regressors, you will fail, because you have
more (restrictions on) regressors than clusters.  You will probably also
see that the F statistic automatically reported by areg or xtreg is
missing and highlighted in blue, and if you click on it you'll get a
longish discussion that includes the following:

"There is no mechanical problem with your model, but you need to
consider carefully whether any of the reported standard errors mean
anything.  The theory that justifies the standard error calculation is
asymptotic in the number of clusters, and we have just established that
you are estimating at least as many parameters as you have clusters.

Putting that concern aside, the model test statistic issue is that you
cannot simultaneously test that all coefficients are zero because there
is insufficient information.  You could test a subset, but not all, and
so Stata refuses to report the overall model test statistic."

The full help message is available as -help j_robustsingular-.
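
A rough sketch of why the cluster count caps what can be tested. The notation is mine, not the help file's: X holds the k estimated regressors after the fixed effects have been partialled out, G is the number of clusters, u-hat is the residual vector, and the finite-sample scaling factor is ignored.

```latex
\begin{align*}
\hat{V} &= (X'X)^{-1}\Big(\sum_{g=1}^{G} s_g s_g'\Big)(X'X)^{-1},
  \qquad s_g = X_g'\hat{u}_g \in \mathbb{R}^{k}, \\
\operatorname{rank}(\hat{V}) &\le \min(k,\, G)
  \qquad \text{(each $s_g s_g'$ has rank at most one)}, \\
W &= (R\hat{\beta} - r)'\,(R\hat{V}R')^{-1}\,(R\hat{\beta} - r)
  \qquad \text{requires } q = \operatorname{rank}(R) \le \operatorname{rank}(\hat{V}).
\end{align*}
```

Because the OLS scores sum to zero (X'u-hat = 0), the bound tightens to G - 1.  So a joint test of all k coefficients cannot be computed once k exceeds that, while a test of a few restrictions (small q) can at least be computed.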

However ... there is some ambiguity in the statement above, since it
implies that it's *possible* that none of the SEs mean anything.  I used
to think this was automatically the case if the cluster-robust var-cov
matrix is not full rank, but now I'm not sure.  It may be the case that,
for example, you can still get valid tests of one or a few coefficients
even if you can't test them all jointly.  I've been meaning to go
searching through the literature to find the references on this but
haven't had the time....
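
A minimal sketch of how one might look at this in practice, on simulated data.  Everything in it is hypothetical (the variable names, 50 states, 20 people per state, 5 years, 40 age values, the data-generating process), and it uses the newer -vce(cluster ...)- and factor-variable syntax rather than the -cluster()- option and -xi:- prefix of older releases:

```stata
* Hypothetical simulated panel: 50 states x 20 people x 5 years = 5,000 obs.
clear
set seed 20060301
set obs 50
generate state = _n
expand 20
bysort state: generate id = (state - 1)*20 + _n
expand 5
bysort id: generate year = _n

* Purely illustrative covariates: age takes 40 integer values (20 to 59),
* policy varies by state and year.
generate age = 20 + floor(40*runiform())
generate byte policy = mod(state + year, 3) == 0
generate y = 0.5*policy + 0.01*age + rnormal()

* Individual effects are absorbed (not estimated); the age dummies and
* policy are estimated; standard errors are clustered on state.
areg y policy i.age, absorb(id) vce(cluster state)

* Compare the number of clusters with the number of estimated parameters.
display "clusters            = " e(N_clust)
display "model df            = " e(df_m)
display "residual df for VCE = " e(df_r)

* A test of one coefficient, then a joint test of the whole block of
* age dummies.
test policy
testparm i.age
```

Lowering the number of states below the number of age dummies (say, -set obs 30-) is one way to see what Stata reports in the rank-deficient case.  Whether the single-coefficient SE can be trusted there is exactly the open question above; the sketch only shows which tests Stata is willing to compute.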

Cheers,
Mark

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of 
> Daniel Simon
> Sent: 01 March 2006 15:54
> To: [email protected]
> Subject: RE: st: fixed effects with clustering when the 
> number of levels of variable to be absorbed exceeds number of clusters
> 
> Mark - thanks, this is very helpful, as usual. Now, I have a
> follow-up. If, in addition to the set of fixed effects that I am
> absorbing, I have another set of dummies that I am including manually
> with i., and there are about as many of these i. fixed effects as
> there are clusters, then this will pose a problem. Is that correct?
> For example, if in my individual fixed effects model where I cluster
> on state, I also want to include fixed effects for age (e.g. a
> separate dummy for each value of age in years in my dataset), and I
> have forty different age dummies, then the number of age dummies is
> close to the number of clusters. In this situation, is there some way
> to assess whether the estimates of the std errors are problematic?
> And is there some alternative way to proceed?
> 
> Thanks again. Daniel
> 
> At 03:19 PM 3/1/2006 +0000, you wrote:
> >Daniel,
> >
> >What you need to be aware of is that the asymptotics justifying the
> >cluster-robust estimator require the number of clusters to go off to
> >infinity.  I don't think Austin's comment is quite right, at least in
> >the context you've cited it.  The number of fixed effects can be much
> >bigger than the number of clusters, and that won't by itself cause a
> >problem - after all, the fixed effects are not actually being estimated.
> >What *will* cause problems is if you have very few clusters, esp. if
> >compared to the number of parameters that you *are* estimating.  In
> >your example, you want to cluster by state.  50 is not very far on the
> >way to infinity, but maybe it's enough for your purposes.  But if you
> >also have lots of parameters that you want to test, then you will start
> >running into serious problems (nb: the rank of the cluster-robust
> >var-cov matrix cannot exceed the number of clusters, however many
> >parameters you estimate).
> >
> >Hope this helps.
> >
> >--Mark
> >
> > > -----Original Message-----
> > > From: [email protected]
> > > [mailto:[email protected]] On Behalf Of Daniel 
> > > Simon
> > > Sent: 01 March 2006 15:02
> > > To: [email protected]
> > > Subject: Re: st: fixed effects with clustering when the number of 
> > > levels of variable to be absorbed exceeds number of clusters
> > >
> > > Sorry - I made a mistake in the subject line of my last message.
> > > It is now correct. Daniel
> > >
> > > At 09:59 AM 3/1/2006 -0500, you wrote:
> > > > >Hi Austin - thanks for pointing out that "the number of levels of
> > > > >the absorb() variable should not exceed the number of clusters."
> > > > >I have two questions about this: (1) I assume that the same holds
> > > > >true for xtreg,fe with clustering (given that this yields
> > > > >identical std errors to areg with clustering). Is this assumption
> > > > >correct? (2) Does anyone have suggestions for the most efficient
> > > > >way to estimate fixed-effects models with clustering when there
> > > > >are thousands of fixed effects but clustering occurs on a variable
> > > > >with many fewer units? For example, if I have a panel dataset
> > > > >tracking thousands of individuals over time and I want to examine
> > > > >the impact of a state policy variable, then I would want to
> > > > >estimate a model with individual fixed effects but I would also
> > > > >want to cluster by state.
> > > > >What would be a sensible way to proceed in this situation?
> > > >
> > > >Thanks. Daniel
> > > >
> > > >At 02:06 PM 2/28/2006 -0500, you wrote:
> > > > >>Perhaps I should ignore this question in the same way you have
> > > > >>ignored the advice in the Statalist FAQ on how to write a
> > > > >>well-formed question (in particular, you give no indication what
> > > > >>command you used or what error message you got, much less show us
> > > > >>the output), but you should certainly read:
> > > > >>       -help xtreg- -help xtdata- and -help areg-
> > > > >>for starters.  Note also that you may want to cluster on id,
> > > > >>assuming your fixed effects are individual id and year effects, to
> > > > >>allow for arbitrary serial correlation within panel, and -cluster-
> > > > >>implies -robust-.  But see the various FAQs on the subject, and
> > > > >>such advice as appears in the relevant help files, e.g.
> > > > >>   Note: Exercise caution when using the cluster() option with
> > > > >>         areg.  The effective number of degrees of freedom for the
> > > > >>         robust variance estimator is (n_g - 1), where n_g is the
> > > > >>         number of clusters.  Thus the number of levels of the
> > > > >>         absorb() variable should not exceed the number of clusters.
> > > >>
> > > >>On 2/28/06, Yasmine Kent <[email protected]> wrote:
> > > >> > Hi,
> > > >> >
> > > >> > Apologies if this is a basic question...
> > > >> >
> > > > >> > I would like to obtain ROBUST standard errors and t-statistics
> > > > >> > in a panel data regression that I am running (with 2-way fixed
> > > > >> > effects). The 'robust' command does not appear to work with
> > > > >> > panel data; it gives an error message. Theoretically, I thought
> > > > >> > that it should be possible to get these. Is there another
> > > > >> > command I should use instead? (I am using Stata 8).
> > > >> >
> > > >> > Thank you!
> > > >> > Yasmine
> > > >>
> > > >>*
> > > >>*   For searches and help try:
> > > >>*   http://www.stata.com/support/faqs/res/findit.html
> > > >>*   http://www.stata.com/support/statalist/faq
> > > >>*   http://www.ats.ucla.edu/stat/stata/
> > > >
> > > >Daniel Simon
> > > >Assistant Professor
> > > > >Department of Applied Economics and Management
> > > > >Cornell University
> > > >(607) 255-1626
> > > >*
> > > >*   For searches and help try:
> > > >*   http://www.stata.com/support/faqs/res/findit.html
> > > >*   http://www.stata.com/support/statalist/faq
> > > >*   http://www.ats.ucla.edu/stat/stata/
> > >
> >
> 



