


RE: st: cluster

From   Elin Vimefall
Subject   RE: st: cluster
Date   Thu, 28 Feb 2013 08:29:19 +0000

Thanks a lot!
The references really helped me.
However, I still have one question that I cannot solve.
When writing out the formula for the robust (clustered) variance estimator, I want to incorporate weights. Does anyone know how to do that?

Best regards
/Elin Vimefall

-----Original Message-----
From: [] On Behalf Of Austin Nichols
Sent: 27 February 2013 16:17
Subject: Re: st: cluster

Elin Vimefall:
Where in reading about clustered SEs did you see a reference to fixed-effects probit?  If you look at -help xtprobit- or
you will see that
"There is no command for a conditional fixed-effects model, as there does not exist a sufficient statistic allowing the fixed effects to be conditioned out of the likelihood. Unconditional fixed-effects probit models may be fit with probit command with indicator variables for the panels. However, unconditional fixed-effects estimates are biased."

That bias is often overstated, and is probably not severe at all if there are many observations per panel. I think you probably have many children per district, and can safely include district dummies. You should probably also report the results of a linear model with and without district dummies. In all of these cases, you should use the cluster option if you have a large number of clusters.
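The four specifications suggested above can be sketched in Stata as follows. This is only an illustration on simulated data: the variable names school, x1, x2 and district follow the original question, but the sample sizes, coefficients and seed are hypothetical choices made so that the block runs on its own.

```stata
* Simulated data: 50 districts with 40 children each (all values hypothetical)
clear
set seed 12345
set obs 50
generate district = _n
* District-level intercept, correlated with x1 below
generate u = 0.5*rnormal()
expand 40
generate x1 = rnormal() + 0.5*u
generate x2 = rnormal()
generate school = (0.5*x1 - 0.3*x2 + u + rnormal()) > 0

* Pooled probit with cluster-robust SEs
probit school x1 x2, vce(cluster district)

* Unconditional fixed-effects probit: district dummies plus clustering
probit school x1 x2 i.district, vce(cluster district)

* Linear model without and with district dummies
regress school x1 x2, vce(cluster district)
regress school x1 x2 i.district, vce(cluster district)
```

Comparing the coefficients on x1 and x2 across the runs gives a rough sense of how much the district intercepts matter. If a district has no variation in school, probit drops its dummy and its observations with a note.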

In general, a fixed-effects (FE) model can reduce bias in estimated coefficients while inflating SEs appropriately when panels have different intercepts u_i (kids in different districts have different intrinsic likelihoods to be in school in your setting) that are correlated with other characteristics X_i. Random effects (RE) estimators do not address this bias, but can improve efficiency of estimates. Both can increase bias due to measurement error in predictors.

The reason to cluster SEs is not to get better estimated coefficients, but to get better estimated variability of the estimated coefficients, to improve inference without altering the estimated coefficients.
Cluster-robust SEs are clearly better if you have a large number of clusters (districts, in your case), relative to the number of coefs/restrictions you want to test, but can perform poorly with a small number of clusters: see e.g.
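For reference, here is a sketch of the sandwich form usually written for a maximum-likelihood estimator such as probit, with G clusters and optional sampling weights w_i (set w_i = 1 for the unweighted case). The notation is an assumption for illustration and is not taken from this thread; it follows my understanding of how Stata documents its robust variance calculation, so check it against the manual before relying on it.

```latex
% s_i: score (row vector) of observation i; for probit, with q_i = 2 y_i - 1:
%   s_i = \frac{q_i \, \phi(q_i x_i \beta)}{\Phi(q_i x_i \beta)} \, x_i
% u_g: weighted sum of the scores within cluster g
u_g = \sum_{i \in g} w_i \, s_i

% \hat{V}: conventional (weighted) inverse negative Hessian
\hat{V} = \left( - \sum_{i} w_i \,
          \frac{\partial^2 \ln L_i}{\partial \beta \, \partial \beta'} \right)^{-1}

% Cluster-robust variance with the finite-sample factor G/(G-1)
\hat{V}_{\mathrm{cluster}} = \frac{G}{G-1} \; \hat{V}
          \left( \sum_{g=1}^{G} u_g' \, u_g \right) \hat{V}
```

The weights therefore enter in two places: inside \hat{V} and inside each cluster's score sum u_g. The point estimates change only because the weights change them, not because of the clustering. In Stata, a hypothetical weight variable w would be supplied as [pweight = w] together with vce(cluster district).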

On Wed, Feb 27, 2013 at 6:53 AM, Elin Vimefall wrote:
> Dear list members
> To analyze which children are enrolled in school I use a probit 
> model. To control for the fact that the error terms are correlated at 
> district level I use cluster():
> probit school x1 x2..., cluster(district)
> However, I do not really understand how cluster() works. When reading about clustering I understand that you can do both fixed effects and random effects. How can I do that in Stata, and which of them does cluster() do?
> Best regards /Elin Vimefall