Re: st: In this particular case: should I prefer clustering or a random-effects model

From	Austin Nichols <[email protected]>
To	[email protected]
Subject	Re: st: In this particular case: should I prefer clustering or a random-effects model
Date	Thu, 7 Jul 2011 08:37:29 -0400

Andrea Bennett <[email protected]> :
If you can run fixed effects, that means that treatment was randomly
assigned within class, not across classes, which seems to imply a
violation of SUTVA since you cannot assume a treatment applied to one
kid in a class does not also affect other kids in the class.  That
problem aside, why are fixed effects inappropriate for 36 classes?  If
fixed effects are inappropriate then random effects are inappropriate,
since the random effects estimator is a weighted average of the fixed
effects estimator and another estimator, maintaining the FE assumption
and some additional not-easily-justified assumptions (normality and
uncorrelatedness of the error).  If you can run an RE or multilevel
model, you can also run FE, and the only consideration is the improved
efficiency with the additional maintained assumptions (a major
consideration for typical small experiments, to be sure). The pooled
regression is fine in that case, however, so run -ivreg2- (on SSC, not
just for IV regression) with clustering in each dimension in turn and
then in both dimensions--if standard errors are comparable in each
case, you will feel more confident that the downward bias in the
cluster-robust SE is negligible.  If you have imperfect takeup of a
binary treatment, you can also instrument for receipt of treatment
with treatment assignment using the same program.
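
A minimal sketch of the comparisons described above. The variable names
(score, treat, assigned, received, class_id, teacher_id) are hypothetical
and not taken from the original post; -ivreg2- also requires -ranktest-
from SSC.

```stata
* One-time installs from SSC (ivreg2 depends on ranktest)
* ssc install ivreg2
* ssc install ranktest

* FE and RE by class, for comparison; FE needs treat to vary within class
xtset class_id
xtreg score treat, fe vce(cluster class_id)
xtreg score treat, re vce(cluster class_id)

* Pooled regression via ivreg2 (no endogenous regressors, so this is OLS),
* clustering in each dimension in turn
ivreg2 score treat, cluster(class_id)
ivreg2 score treat, cluster(teacher_id)

* Two-way clustering on class and teacher
ivreg2 score treat, cluster(class_id teacher_id)

* Imperfect takeup: instrument receipt of treatment with assignment
ivreg2 score (received = assigned), cluster(class_id)
```

If the standard error on treat is similar across the three pooled runs,
that is the informal check suggested above; it is not a formal test.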

If in fact you have a cluster-randomized design, you should have
calculated power (required sample size, minimum detectable effect
size, etc.) in advance assuming the analysis design (pooled, FE,
multilevel hierarchical model, etc.) to be used once data is
collected, using e.g.
http://www.urban.org/publications/1001394.html
or your own custom simulations, so you should not be designing the
analysis after the data has been collected!
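
As an illustration of the "custom simulations" route, here is a rough
sketch of simulated power for a cluster-randomized design analyzed by
pooled regression with cluster-robust SEs. Only the 36 classes come from
the thread; the program name simpower, 25 students per class, an effect
of 0.3 SD, and an intraclass correlation of 0.1 are assumptions chosen
for the example.

```stata
* Sketch: simulated power, treatment assigned at the class level
* Arguments: number of classes, students per class, effect size, ICC
capture program drop simpower
program define simpower, rclass
    args nclass nper effect icc
    drop _all
    set obs `nclass'
    generate class_id = _n
    * randomly assign half of the classes to treatment
    generate double u = runiform()
    sort u
    generate byte treat = _n <= `nclass'/2
    * class-level effect with variance icc (total variance normalized to 1)
    generate double class_eff = rnormal(0, sqrt(`icc'))
    * one row per student
    expand `nper'
    generate double score = `effect'*treat + class_eff + ///
        rnormal(0, sqrt(1 - `icc'))
    regress score treat, vce(cluster class_id)
    * two-sided test at the 5% level, using the cluster-based df
    return scalar reject = abs(_b[treat]/_se[treat]) > invttail(e(df_r), 0.025)
end

set seed 12345
simulate reject = r(reject), reps(500) nodots: simpower 36 25 0.3 0.1
summarize reject   // the mean of reject is the estimated power
```

Swapping the -regress- line for the FE or multilevel estimator you plan
to use gives power under that analysis design instead.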

On Thu, Jul 7, 2011 at 6:21 AM, Andrea Bennett <[email protected]> wrote:
> Hi
>
> I have student test scores for a generalized test. These tests were conducted at randomly selected high schools, in randomly selected classes with different levels of schooling (still all at high school level).
>
> I want to measure individual performance after an exogenous treatment intervention in half of the sample. From what I understand, I could go with a standard regression and cluster on the class level, or I could use a random-effects model. Using fixed-effects dummies for each class is not appropriate since I only have 36 classes.
>
> I've studied some articles and books but am still not quite sure if I should prefer one over the other (I tend to prefer a random-effects model). Any advice?
>
> In addition, is it advisable to cluster on multiple levels? (I have teachers participating with multiple classes, mostly at the same school but in two cases also at different schools.) If so, what Stata command should I rely on, since xtreg cannot do that?
