st: RE: Small sample with clustered data

From   "Schaffer, Mark E"
Subject   st: RE: Small sample with clustered data
Date   Mon, 28 Nov 2011 21:40:58 -0000


A few thoughts...

You say that the values of the dep var for the EU members are
correlated.  But that's not necessarily a problem.  What matters for the
VCE is the correlation of the error term u_i with u_j, or more
precisely, the correlation of x_i*u_i with x_j*u_j, where x_i is a
regressor.

Say, for example, that the true DGP has an EU fixed effect, i.e., an EU
dummy belongs in the estimating equation.  If you estimate without an EU
dummy, the dependence in the errors can mess up the classical or robust
VCE.  The cluster-robust VCE would deal with this by, in effect,
creating an aggregated super-observation for the EU in the calculation
of the VCE.  But a much simpler way of dealing with the problem is just
to include an EU dummy.  It's just like estimating a panel data model
when you expect the observations for a panel unit (country, household,
whatever) to have errors that are correlated via a fixed effect.  You
could use OLS and cluster-robust SEs, but using the LSDV estimator is
better, and might on its own be a perfectly satisfactory solution.
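
To make the EU-dummy point concrete, here is a minimal sketch in plain
Python (not Stata) on simulated data.  Everything in it is an assumption
made for illustration: the DGP (intercept 1, slope 0.5, EU fixed effect
3, noise sd 0.5), the single regressor x, and the helper names ols,
residuals and eu_mean.  It only shows that leaving out the dummy pushes
the shared EU effect into the residuals, while including it forces the
EU residuals to average zero.

```python
import random

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def residuals(X, y):
    """OLS residuals of y on the columns of X."""
    b = ols(X, y)
    return [yi - sum(bj * xj for bj, xj in zip(b, r)) for r, yi in zip(X, y)]

random.seed(1)
n_eu, n_other = 26, 24                     # sample split from the question
eu = [1] * n_eu + [0] * n_other
x = [random.gauss(0, 1) for _ in eu]       # hypothetical regressor
# Hypothetical DGP with an EU fixed effect of 3.
y = [1 + 0.5 * xi + 3 * d + random.gauss(0, 0.5) for xi, d in zip(x, eu)]

res_pooled = residuals([[1, xi] for xi in x], y)               # no EU dummy
res_dummy = residuals([[1, xi, d] for xi, d in zip(x, eu)], y)  # with EU dummy

def eu_mean(res):
    """Average residual over the EU observations."""
    return sum(r for r, d in zip(res, eu) if d) / n_eu

mean_pooled = eu_mean(res_pooled)   # clearly positive: the omitted effect
mean_dummy = eu_mean(res_dummy)     # zero up to rounding, by construction
print(mean_pooled, mean_dummy)
```

Whether a single dummy is enough in a real application depends on
whether the dependence among the EU errors really has this fixed-effect
form; the sketch assumes that it does.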

A related thought: you have 24 non-EU countries and 26 EU countries.
You seem happy with the 24 non-EU sample, and presumably if you were to
estimate using just these 24, the only thing that would bother you would
be the small-ish sample size.  How do you feel about estimating using a
sample of just the 26 EU countries?  If you feel OK about that as well,
then perhaps your main concern should be about whether imposing the
common coefficients assumption for the combined sample of 50 is
reasonable.

As for the number of clusters issue, you have two problems.  First, 25
clusters isn't very many.  The cluster-robust VCE gets its asymptotic
properties via the number of clusters going off to infinity, and 25
isn't very far on the way.  Second, Austin Nichols has done some work
(I think cited in the 2007 presentation you mention) that shows that the
cluster-robust VCE doesn't work well with very unbalanced panels.
Knowing only what you've told us about your problem, I'd be reluctant
to recommend the cluster-robust VCE as the answer.  Dealing with the
problem parametrically (e.g., with an EU dummy) seems like a better way
to go.
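
The "super-observation" and the cluster count can be illustrated with a
rough sketch, again in plain Python on simulated data.  The data, the
single regressor, and the helper names fit and slope_variance are
assumptions for illustration only, and no small-sample correction is
applied, so this is not a reproduction of what Stata reports.  With each
country as its own cluster the sandwich formula reduces to the usual
robust (HC0) variance built from 50 score terms; with all EU members in
one cluster it is built from only 25 terms, one of which sums over 26
countries.

```python
import random

def fit(x, y):
    """OLS of y on a constant and x; returns residuals and (X'X)^-1 (2x2)."""
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy
    u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    return u, inv

def slope_variance(x, u, inv, cluster):
    """Sandwich variance of the slope and the number of score terms used."""
    scores = {}
    for xi, ui, g in zip(x, u, cluster):
        s = scores.setdefault(g, [0.0, 0.0])
        s[0] += ui         # score for the constant, summed within cluster
        s[1] += xi * ui    # score for x, summed within cluster
    meat = [[sum(s[i] * s[j] for s in scores.values()) for j in range(2)]
            for i in range(2)]
    row = inv[1]           # slope row of (X'X)^-1 (the matrix is symmetric)
    var = sum(row[i] * meat[i][j] * row[j] for i in range(2) for j in range(2))
    return var, len(scores)

random.seed(2)
n_eu, n_other = 26, 24
n = n_eu + n_other
x = [random.gauss(0, 1) for _ in range(n)]
shock = random.gauss(0, 1)                 # hypothetical shock shared by the EU
y = [1 + 0.5 * xi + (shock if i < n_eu else 0.0) + random.gauss(0, 1)
     for i, xi in enumerate(x)]

u, inv = fit(x, y)
singletons = list(range(n))                                # every country alone
eu_cluster = ["EU" if i < n_eu else i for i in range(n)]   # EU as one cluster

var_robust, m_robust = slope_variance(x, u, inv, singletons)
var_cluster, m_cluster = slope_variance(x, u, inv, eu_cluster)
print(m_robust, var_robust)
print(m_cluster, var_cluster)
```

The sketch makes no claim about which of the two variances is larger in
any given sample; it only shows how few independent terms the
cluster-robust version rests on here, and how lopsided they are.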


> -----Original Message-----
> From: 
> [] On Behalf Of 
> Sent: 28 November 2011 11:24
> To:
> Subject: st: Small sample with clustered data
> Dear Statalist,
> My sample consists of 50 countries with 26 of them being EU 
> Member States.
> The problem is that the values of the dependent variable for 
> the EU members are not independent of each other. Thus, I 
> created a dummy variable "eucluster" that indicates if a 
> country is in the EU (1=yes; 0=no) and used the 
> cluster(eucluster) option after the OLS Regressions in Stata 
> 10. However, in "Clustered Errors in Stata"
> (Nichols/Schaffer 2007 - 
> it is mentioned that if M, the number of clusters, is small 
> matters could even get worse by using the cluster option (Sheet 20).
> M=50 seems to be the minimum number of clusters required.
> I have 24 clusters consisting of 1 country and 1 cluster 
> comprising 26 EU members (6 independent variables).
> I do not know how to deal "correctly" with these clustered 
> data in Stata. Hence, I would highly appreciate if someone 
> could give me advice or suggest a solution on how to deal 
> with the clustered data in such a small sample.
> Thanks for Consideration.
> Lars
