Re: st: Clustering of Standard Errors in a fixed effect model.

From	Maarten Buis <[email protected]>
To	[email protected]
Subject	Re: st: Clustering of Standard Errors in a fixed effect model.
Date	Mon, 21 Jun 2010 06:31:21 -0700 (PDT)

--- Austin Nichols wrote:
> > The number of clusters and how balanced they are determine the
> > tradeoff--see e.g. http://www.stata.com/meeting/13uk/nichols_crse.pdf
> > and refs therein, and for the follow-up see
> > http://www.stata.com/meeting/boston10/abstracts.html#baum

--- On Mon, 21/6/10, natasha agarwal wrote:
> Thanks Austin. I have read this paper.
> On this note, does it mean that if I have 30 clusters with
> a very unbalanced cluster size like one cluster size being 2000
> observations and the other say 30 observations will give me
> inconsistent results?

I like to use simulations in order to get an idea of how big a 
certain problem is for your data.

1) estimate the model of interest on your data
2) store the parameter of interest, for the purpose of this
   simulation this will be regarded as the "population value".
3) use -bsample- (with the -cluster()- option) to draw a 
   sample from your "population"
4) rerun your model of interest on this "sample"
5) test whether your parameter of interest equals the 
   "population value"
6) store the p-value (and often it is also interesting to store
   the parameter of interest)
7) repeat steps 3-6 many times, for example using -simulate-, 
   see -help simulate-.
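
The steps above could be sketched roughly as follows. This is only a
sketch under assumptions: the data are made up to mimic 30 very
unbalanced clusters, the variable names (y, x, id, newid), the program
name -onerep- and the -regress- model are hypothetical stand-ins, and
you would replace them with your own data and model.

```stata
* Sketch only: data, variable names and model are hypothetical stand-ins
clear all
set seed 20100621

* Made-up "data": 30 clusters, one with 2000 observations, 29 with 30
set obs 30
generate id  = _n
generate u   = rnormal()                  // cluster-level error component
generate n_i = cond(id == 1, 2000, 30)
expand n_i
generate x = rnormal()
generate y = 0.5*x + u + rnormal()

* Steps 1-2: estimate the model and keep the "population value"
regress y x, vce(cluster id)
local b0 = _b[x]

* Save the "population" so every replication starts from the same data
tempfile pop
save `pop'

* Steps 3-6: one replication
program define onerep, rclass
    args popfile b0
    use "`popfile'", clear
    bsample, cluster(id) idcluster(newid)  // step 3: resample whole clusters
    regress y x, vce(cluster newid)        // step 4: rerun the model
    test _b[x] = `b0'                      // step 5: test against "population value"
    return scalar p = r(p)                 // step 6: store the p-value
    return scalar b = _b[x]                //         and the estimate
end

* Step 7: repeat many times
simulate p = r(p) b = r(b), reps(1000) nodots: onerep "`pop'" `b0'
```

The -idcluster()- option gives each resampled cluster its own
identifier, so a cluster that is drawn twice is treated as two clusters
when the standard errors are computed.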

The stored p-values should follow a uniform distribution, because the
null hypothesis is true by construction in this simulation. This means
that you will reject the true null hypothesis in 5% of the samples if
you choose a significance level of 5%, in 10% of the samples if you
choose a significance level of 10%, etc. If the p-values do not
follow a uniform distribution then the nominal significance level and
the true rejection rate will not correspond. The logic of a statistical
test (at a 5% significance level) is that a statement is "trustworthy"
because it used a method that will wrongly reject the null hypothesis in
only 5% of the times that method is used. So if there are major
deviations between the nominal significance level and the true rejection
rate we undermine the logic behind the test. Large deviations from the
uniform distribution in the p-values correspond to large deviations in
the rejection rate compared to the nominal significance level.

It is often also informative to look at the "sampling distribution" of
the parameter of interest itself. 
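
One possible way of checking the stored results is sketched below. It
assumes the p-values and estimates are stored in variables called p and
b (hypothetical names). The two -generate- lines only create stand-in
values so that the sketch runs on its own; they are not results of any
kind and should be dropped once your own stored results are in memory.

```stata
* Sketch only: p and b are hypothetical names for the stored results
clear
set seed 12345
set obs 1000
generate p = runiform()            // stand-in p-values, NOT real results
generate b = rnormal(0.5, 0.05)    // stand-in estimates, NOT real results

* True rejection rates at nominal levels of 5% and 10%:
* the means should be close to 0.05 and 0.10
generate byte rej05 = p < 0.05
generate byte rej10 = p < 0.10
summarize rej05 rej10

* Compare the p-values with a uniform(0,1) distribution
histogram p, width(0.05) start(0)
ksmirnov p = p                     // one-sample test, uniform CDF is F(p) = p

* "Sampling distribution" of the parameter of interest
summarize b, detail
kdensity b
```

A histogram with bars of roughly equal height suggests the nominal and
true rejection rates agree; a spike near 0 suggests the test rejects
too often.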

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------
