

From: Austin Nichols <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Small sample with clustered data
Date: Tue, 29 Nov 2011 06:19:24 -0500

Lars <Lars12398@web.de>:
You can estimate the bias in the SE by simulating data just like yours, where you control the correlations and the actual treatment effects. If the rejection rate of a nominal 1% test is in the 5% range for some coefficients, while tests for other coefficients (vars with no clustering) have correct size, perhaps you just use a higher standard of "significance" for some coefs than for others. You will have to run millions, or at least hundreds of thousands, of simulations, though, which will take some time... It is faster to just add the caveat "significance of results must be interpreted with caution" when reporting OIM, het-robust, or cluster-robust SEs.

On Mon, Nov 28, 2011 at 4:14 PM, <Lars12398@web.de> wrote:
> Dear Austin,
>
> thank you for your reply. If I understand you correctly,
> you suggest using cluster(countryid) after the regression while
> controlling for euclus. Countryid is a number from 1 to 50. This works.
> The results are the same as if I use the robust option after the regression.
> So do you think this is the best option, and that I should state that the SEs are
> probably biased downward and thus significant results have to be interpreted with caution?
> What if the coefficients are still significant even though I do not use the cluster option?
> Is there a way to estimate the bias?
>
> Best
>
> Lars
>
>
> -----Original Message-----
> From: "Austin Nichols" <austinnichols@gmail.com>
> Sent: 28.11.2011 20:00:41
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Small sample with clustered data
>
>>Lars <Lars12398@web.de>:
>>You are likely to have SEs biased downward no matter what you do if
>>you use the 24 cluster design. Can you cluster by country (50
>>clusters) but include eucluster as an explanatory variable?
>>
>>On Mon, Nov 28, 2011 at 6:24 AM, <Lars12398@web.de> wrote:
>>> Dear Statalist,
>>>
>>> My sample consists of 50 countries, 26 of them EU Member States.
>>> The problem is that the values of the dependent variable for the EU members are not
>>> independent of each other. Thus, I created a dummy variable "eucluster" that indicates
>>> whether a country is in the EU (1=yes; 0=no) and used the cluster(eucluster) option after the
>>> OLS regressions in Stata 10. However, in "Clustered Errors in Stata"
>>> (Nichols/Schaffer 2007, http://repec.org/usug2007/crse.pdf) it is mentioned that if M,
>>> the number of clusters, is small, using the cluster option could even make matters worse (slide 20).
>>> M=50 seems to be the minimum number of clusters required.
>>>
>>> I have 24 clusters consisting of 1 country each and 1 cluster comprising the 26 EU members (6 independent variables).
>>> I do not know how to deal "correctly" with these clustered data in Stata. Hence, I would highly appreciate it if someone could
>>> give me advice or suggest a solution for dealing with clustered data in such a small sample.
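
Editorial sketch of the simulation idea in Austin's reply at the top of this message. It is written in plain Python rather than Stata, and everything in it is an assumption made for illustration: the function name `simulate_rejection_rate`, a single regressor, a shared shock of standard deviation `shared_sd` hitting only the 26 EU members in both the regressor and the error, HC1 het-robust SEs, and the normal critical value 1.96 for a nominal 5% test. It is not the model from the thread (which has 6 regressors), so treat it as a template to adapt to the real data structure.

```python
import random


def simulate_rejection_rate(n_sims=2000, n_countries=50, n_eu=26,
                            shared_sd=1.0, crit=1.96, seed=12345):
    """Share of simulated samples in which a true null (slope = 0) is rejected.

    Hypothetical data-generating process: the first n_eu countries share one
    common shock in x and one in the error; all others are independent.
    """
    rng = random.Random(seed)
    n = n_countries
    rejections = 0
    for _ in range(n_sims):
        # Shared EU shocks, redrawn in every simulated sample.
        zx = rng.gauss(0.0, shared_sd)
        ze = rng.gauss(0.0, shared_sd)
        x = [(zx if i < n_eu else 0.0) + rng.gauss(0.0, 1.0) for i in range(n)]
        # The true slope is zero, so y consists of the error term only.
        y = [(ze if i < n_eu else 0.0) + rng.gauss(0.0, 1.0) for i in range(n)]

        # OLS of y on a constant and x, computed from deviations from the mean.
        xbar = sum(x) / n
        ybar = sum(y) / n
        dx = [xi - xbar for xi in x]
        sxx = sum(d * d for d in dx)
        b = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
        resid = [(yi - ybar) - b * d for d, yi in zip(dx, y)]

        # HC1 heteroskedasticity-robust variance of the slope (ignores clustering).
        var_b = (n / (n - 2)) * sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2

        if abs(b) / var_b ** 0.5 > crit:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    rate = simulate_rejection_rate()
    print(f"Rejection rate of a nominal 5% test under a true null: {rate:.3f}")
```

A rejection rate well above the nominal level would indicate SEs that are biased downward for that coefficient. The Monte Carlo standard error of an estimated rejection rate p over R replications is roughly sqrt(p(1-p)/R), which is why pinning down rates near 1% takes a very large number of replications.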
