
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: Re: st: Direction of the effect of the cluster command on the standard error depends on the inclusion of a control variable

From	Austin Nichols <[email protected]>
To	[email protected]
Subject	Re: Re: st: Direction of the effect of the cluster command on the standard error depends on the inclusion of a control variable
Date	Thu, 6 Jan 2011 08:59:29 -0500

Kit-
It is useful to think of super-obs, but not quite right.  If you have 50
clusters and 100 regressors (with a few thousand obs) but are only
interested in testing one coefficient, you will typically be fine,
i.e. the bias in the SE will be negligible, so the cluster-robust SE
(CRSE) gives correct inference on average.  It may often be the case
that no alternative approach gets you correct inference (except
resampling clusters for a cluster-robust bootstrap).  So estimating a
regression with 50 obs and 100 coefficients is not quite the right
analogy.  It is more useful to think of the "effective" sample size as
lying between M (number of clusters) and N (number of obs), computable
using "roh" (rate of homogeneity, a measure of within-cluster
correlation) per Kish, L. (1965), Survey Sampling, New York: Wiley.
Note that the CRSE is also the standard svy estimator.
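
To make the "between M and N" point concrete, here is a minimal sketch of the usual Kish-style approximation, n_eff = N / (1 + (m - 1) * roh), where m = N / M is the average cluster size. The message above does not spell out a formula, so the equal-cluster-size form, the function name kish_effective_n, and the example value of roh are all assumptions made for illustration.

```python
def kish_effective_n(n_obs, n_clusters, roh):
    """Approximate effective sample size under clustering.

    Assumes (for illustration) equal-sized clusters and the Kish-style
    design effect deff = 1 + (m_bar - 1) * roh, with m_bar = N / M.
    """
    if n_clusters <= 0 or n_obs < n_clusters:
        raise ValueError("need 0 < n_clusters <= n_obs")
    if not 0.0 <= roh <= 1.0:
        # Negative roh is possible in principle; it is excluded here
        # to keep the sketch simple.
        raise ValueError("roh is restricted to [0, 1] in this sketch")
    m_bar = n_obs / n_clusters          # average cluster size
    deff = 1.0 + (m_bar - 1.0) * roh    # design effect
    return n_obs / deff


# Hypothetical numbers echoing the thread: 50 clusters, 3000 obs.
for roh in (0.0, 0.05, 1.0):
    print(roh, kish_effective_n(3000, 50, roh))
```

With roh = 0 the observations within a cluster carry independent information and the effective size equals N; with roh = 1 each cluster behaves like a single observation and the effective size collapses to M, which is the super-obs case.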

On Thu, Jan 6, 2011 at 8:20 AM, Christopher Baum <[email protected]> wrote:
> On Jan 6, 2011, at 2:33 AM, Stas wrote:
>
>>
>> There are terrible small sample biases exhibited by -robust- and
>> -cluster()- standard errors with small # of observations and clusters,
>> respectively. As was noted by Justina, four clusters is SO far away
>> from asymptotics that I wouldn't even consider the clustered standard
>> errors in your situation.
>
> Just to add one thing to Stas', Justina's and Austin's replies... It is useful to think of the cluster-robust VCE estimator as generating 'super-observations', one per cluster. Thus with 4 clusters, you are essentially estimating a model with N=4 to compute the VCE. Some official Stata commands will let you do that, even when the number of coefficients > N. Baum-Schaffer-Stillman -ivreg2- (on SSC) will flag that as a problem, as it does not make much sense to do so. But one of the reasons that a small number of clusters may yield horrible results is that it represents estimation with a very small sample.
>
> Kit

