Re: st: cluster


From   Austin Nichols <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: cluster
Date   Wed, 27 Feb 2013 10:17:04 -0500

Elin Vimefall <Elin.Vimefall@oru.se>:
Where in reading about clustered SEs did you see a reference to
fixed-effects probit?  If you look at
-help xtprobit- or
http://www.stata.com/help.cgi?xtprobit
you will see that
"There is no command for a conditional fixed-effects model, as there
does not exist a sufficient statistic allowing the fixed effects to be
conditioned out of the likelihood. Unconditional fixed-effects probit
models may be fit with probit command with indicator variables for the
panels. However, unconditional fixed-effects estimates are biased."

That bias is often overstated, and is probably not severe at all if
there are many observations per panel. I think you probably have many
children per district, and can safely include district dummies. You
should probably also report the results of a linear model with and
without district dummies. In all of these cases, you should use the
cluster option if you have a large number of clusters.
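
A minimal sketch of those specifications, assuming the variable names from your post (school is 0/1, x1 and x2 stand in for your covariates, and district is a numeric identifier):

```stata
* Assumed names: school (0/1 outcome), x1 x2 (covariates), district (numeric id)

* Probit with district indicators (unconditional fixed effects),
* with SEs clustered by district
probit school x1 x2 i.district, vce(cluster district)

* Linear probability model without district dummies
regress school x1 x2, vce(cluster district)

* Linear probability model with district dummies
regress school x1 x2 i.district, vce(cluster district)
```

If district is stored as a string, something like -egen id = group(district)- would be needed first to create a numeric identifier for the i. prefix.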

In general, a fixed-effects (FE) model can reduce bias in estimated
coefficients while inflating SEs appropriately when panels have
different intercepts u_i (kids in different districts have different
intrinsic likelihoods of being in school in your setting) that are
correlated with other characteristics X_i. Random effects (RE)
estimators do not address this bias, but can improve efficiency of
estimates. Both can increase bias due to measurement error in
predictors.
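
As a rough illustration of the FE and RE alternatives, again assuming your variable names (school, x1, x2, district), the panel commands could be used along these lines. Note that the FE fit here is the linear model, since there is no conditional FE probit:

```stata
* Assumed names: school (0/1 outcome), x1 x2 (covariates), district (numeric id)

* Declare district as the panel variable
xtset district

* Random-effects probit
xtprobit school x1 x2, re

* Linear fixed-effects and random-effects models, SEs clustered by district
xtreg school x1 x2, fe vce(cluster district)
xtreg school x1 x2, re vce(cluster district)
```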

The reason to cluster SEs is not to get better estimated coefficients,
but to get better estimates of their variability, which improves
inference without altering the estimated coefficients themselves.
Cluster-robust SEs are clearly better if you have a large number of
clusters (districts, in your case), relative to the number of
coefs/restrictions you want to test, but can perform poorly with a
small number of clusters: see e.g.
http://www.stata.com/meeting/13uk/nichols_crse.pdf
http://www.stata.com/meeting/uk10/UKSUG10.Baum.pdf
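
To see that clustering changes only the SEs, one could fit the same probit with and without the cluster option and compare the results side by side. This is only a sketch using the variable names from your post; the stored-estimate names plain and clustered are arbitrary:

```stata
* Assumed names: school (0/1 outcome), x1 x2 (covariates), district (cluster id)

* Probit with the default (non-clustered) SEs
probit school x1 x2
estimates store plain

* Same model with cluster-robust SEs
probit school x1 x2, vce(cluster district)

* Number of clusters used; worth checking before trusting the SEs
display e(N_clust)
estimates store clustered

* Coefficients should match across columns; only the SEs differ
estimates table plain clustered, b se
```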

On Wed, Feb 27, 2013 at 6:53 AM, Elin Vimefall <Elin.Vimefall@oru.se> wrote:
> Dear list members
>
> To analyze which children are enrolled in school I use a probit model. To control for the fact that the error terms are correlated at the district level I use cluster()
>
> probit school x1 x2..., cluster(district)
>
> However, I do not really understand how cluster() works. When reading about clustering I understand that you can do both fixed effects and random effects. How can I do that in Stata, and which of them does cluster() do?
>
> Best regards /Elin Vimefall