Re: st: cluster

From	Austin Nichols <[email protected]>
To	[email protected]
Subject	Re: st: cluster
Date	Wed, 27 Feb 2013 10:17:04 -0500

Elin Vimefall <[email protected]>:
Where in reading about clustered SEs did you see a reference to
fixed-effects probit?  If you look at
-help xtprobit- or
http://www.stata.com/help.cgi?xtprobit
you will see that
"There is no command for a conditional fixed-effects model, as there
does not exist a sufficient statistic allowing the fixed effects to be
conditioned out of the likelihood. Unconditional fixed-effects probit
models may be fit with probit command with indicator variables for the
panels. However, unconditional fixed-effects estimates are biased."

That bias is often overstated, and is probably not severe at all if
there are many observations per panel. I think you probably have many
children per district, and can safely include district dummies. You
should probably also report the results of a linear model with and
without district dummies. In all of these cases, you should use the
cluster option if you have a large number of clusters.
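
As a rough sketch of the four fits suggested above, the do-file below
simulates data and runs each one. Everything in it is an assumption
made for illustration: the variable names (school, x1, x2, district),
the 50 districts of 40 children, and the data-generating process are
invented, not taken from the original question. -vce(cluster district)-
is the current spelling of the older -cluster(district)- option.

```stata
* Hypothetical data: 50 districts with 40 children each
clear
set seed 12345
set obs 50
generate district = _n
* District-level intercept, constant within district after -expand-
generate u = 0.5*rnormal()
expand 40
* x1 is correlated with the district intercept, x2 is not
generate x1 = rnormal() + u
generate x2 = rnormal()
generate school = (0.5*x1 - 0.3*x2 + u + rnormal()) > 0

* Probit without and with district dummies, cluster-robust SEs in both
probit school x1 x2, vce(cluster district)
probit school x1 x2 i.district, vce(cluster district)

* Linear models without and with district dummies, for comparison
regress school x1 x2, vce(cluster district)
regress school x1 x2 i.district, vce(cluster district)
```

If a district happens to have every child in (or out of) school, probit
drops that district's dummy and its observations with a note, so the
estimation sample can differ between the two probit fits.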

In general, a fixed-effects (FE) model can reduce bias in estimated
coefficients while inflating SEs appropriately when panels have
different intercepts u_i (kids in different districts have different
intrinsic likelihoods to be in school in your setting) that are
correlated with other characteristics X_i. Random effects (RE)
estimators do not address this bias, but can improve efficiency of
estimates. Both can increase bias due to measurement error in
predictors.
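
A small simulation can make the FE/RE contrast concrete. The sketch
below swaps in a linear outcome rather than the probit from the
question, because the linear case has built-in FE and RE estimators
(-xtreg, fe- and -xtreg, re-); all variable names and the
data-generating process are invented for illustration. The true
coefficient on x is 1, and x is built to be correlated with the
district intercept u. The last line shows the random-effects probit,
which is the only panel probit that -xtprobit- offers.

```stata
* Hypothetical data: 50 districts, 40 children each, true slope = 1
clear
set seed 2013
set obs 50
generate district = _n
* District intercept u_i
generate u = rnormal()
expand 40
* x is correlated with u_i by construction
generate x = rnormal() + u
generate y = x + u + rnormal()
generate school = y > 0

xtset district

* FE uses only within-district variation in x
xtreg y x, fe vce(cluster district)
scalar b_fe = _b[x]

* RE also uses between-district variation, which is contaminated by u_i
xtreg y x, re vce(cluster district)
scalar b_re = _b[x]

display "FE slope: " b_fe "   RE slope: " b_re

* Random-effects probit for the binary outcome
xtprobit school x, re
```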

The reason to cluster SEs is not to get better estimated coefficients,
but to get a better estimate of their variability, i.e. to improve
inference without altering the estimated coefficients. So -cluster()-
fits neither an FE nor an RE model; it changes only the estimated SEs.
Cluster-robust SEs are clearly better if you have a large number of
clusters (districts, in your case), relative to the number of
coefs/restrictions you want to test, but can perform poorly with a
small number of clusters: see e.g.
http://www.stata.com/meeting/13uk/nichols_crse.pdf
http://www.stata.com/meeting/uk10/UKSUG10.Baum.pdf
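
To check the point that clustering changes the SEs but not the
coefficients, one can fit the same probit twice and compare. This is
only a sketch on simulated data; the variable names and the
data-generating process are assumptions, not the poster's data.

```stata
* Hypothetical data: 50 districts, 40 children each
clear
set seed 2013
set obs 50
generate district = _n
* District-level error component and district-level part of x1
generate u = rnormal()
generate z = rnormal()
expand 40
generate x1 = z + rnormal()
generate school = (0.5*x1 + u + rnormal()) > 0

* Same model, default SEs
probit school x1
scalar b_plain  = _b[x1]
scalar se_plain = _se[x1]

* Same model, cluster-robust SEs
probit school x1, vce(cluster district)
scalar b_clust  = _b[x1]
scalar se_clust = _se[x1]

display "coef: " b_plain  " vs " b_clust
display "SE:   " se_plain " vs " se_clust
display "number of clusters: " e(N_clust)
```

The two coefficients should agree, since both fits maximize the same
likelihood; only the variance estimate differs.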

On Wed, Feb 27, 2013 at 6:53 AM, Elin Vimefall <[email protected]> wrote:
> Dear list members
>
> To analyze which children are enrolled in school I use a probit model. To account for the error terms being correlated at the district level I use cluster():
>
> probit school x1 x2..., cluster(district)
>
> However, I do not really understand how cluster() works. From reading about clustering I understand that you can use either fixed effects or random effects. How can I do that in Stata, and which of them does cluster() do?
>
> Best regards /Elin Vimefall