


Re: st: RE: Small sample with clustered data

Subject: Re: st: RE: Small sample with clustered data
Date: Wed, 30 Nov 2011 12:47:44 +0100 (CET)

Thanks to you both for your help.


I am not that happy with the 24-country subsample either, because it consists
of developed and developing countries for which the dependent variable is
not calculated identically. If I use a dummy variable, doesn't
that "control" for possible differences between the two subsamples? I assume the
alternative is a Chow test? Splitting the sample would be really problematic because
of the seven independent variables and the limited degrees of freedom.
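
For illustration, here is a minimal Stata sketch of the two options raised above. All variable names are assumptions and do not come from the thread: y is the dependent variable, x1-x7 are the seven regressors (assumed to be stored consecutively in the dataset), and grp is a 0/1 dummy marking one of the two subsamples. A dummy on its own only shifts the intercept. A Chow-type test also lets every slope differ, and that is where the degrees of freedom go.

```stata
* Hypothetical names: y, x1-x7, grp (0/1 subsample indicator).

* Option 1: common slopes, separate intercept for grp == 1
regress y x1-x7 grp

* Option 2: Chow-type test in a pooled regression.
* Build the grp-by-regressor interactions by hand (this works in Stata 10,
* which has no factor-variable notation).
foreach v of varlist x1-x7 {
    generate double grp_`v' = grp * `v'
}
regress y x1-x7 grp grp_x1-grp_x7

* Joint test that the intercept shift and all slope shifts are zero
testparm grp grp_x1-grp_x7
```

The second regression estimates 16 coefficients (constant, seven slopes, intercept shift, seven slope shifts). It therefore uses as many parameters as splitting the sample would, while assuming a common error variance across the two groups.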



-----Original Message-----
From: "Schaffer, Mark E" <>
Sent: 28.11.2011 22:40:58
Subject: st: RE: Small sample with clustered data

>A few thoughts...
>You say that the values of the dep var for the EU members are
>correlated. But that's not necessarily a problem. What matters for the
>VCE is the correlation of the error term u_i with u_j, or more
>precisely, the correlation of x_i*u_i with x_j*u_j, where x_i is a
>Say, for example, that the true DGP has an EU fixed effect, i.e., an EU
>dummy belongs in the estimating equation. If you estimate without an EU
>dummy, the dependence in the errors can mess up the classical or robust
>VCE. The cluster-robust VCE would deal with this by, in effect,
>creating an aggregated super-observation for the EU in the calculation
>of the VCE. But a much simpler way of dealing with the problem is just
>to include an EU dummy. It's just like estimating a panel data model
>when you expect the observations for a panel unit (country, household,
>whatever) to have errors that are correlated via a fixed effect. You
>could use OLS and cluster-robust SEs, but using the LSDV estimator is
>better, and might on its own be a perfectly satisfactory solution.
>A related thought: you have 24 non-EU countries and 26 EU countries.
>You seem happy with the 24 non-EU sample, and presumably if you were to
>estimate using just these 24, the only thing that would bother you would
>be the small-ish sample size. How do you feel about estimating using a
>sample of just the 26 EU countries? If you feel OK about that as well,
>then perhaps your main concern should be about whether imposing the
>common coefficients assumption for the combined sample of 50 is
>As for the number of clusters issue, you have two problems. First, 25
>clusters isn't very many. The cluster-robust VCE gets its asymptotic
>properties via the number of clusters going off to infinity, and 25
>isn't very far on the way. Second, Austin Nichols has done some work
>(I think cited in the 2007 presentation you mention) that shows that the
>cluster-robust VCE doesn't work well with very unbalanced panels.
>Knowing only what you've told us about your problem, I'd be reluctant
>to recommend the cluster-robust VCE as the answer. Dealing with the
>problem parametrically (e.g., with an EU dummy) seems like a better way
>to go.
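
The alternatives discussed in the message above can be sketched in Stata as follows. This is only an illustration under stated assumptions: y and x1-x6 are assumed names for the dependent variable and the six regressors (stored consecutively), eucluster is the 0/1 EU indicator described in the original post, and clid is a hypothetical cluster identifier created here. If eucluster really is coded 0/1, clustering on it directly would define only two clusters. The structure of 24 single-country clusters plus one EU cluster needs an identifier such as clid.

```stata
* Assumed names: y, x1-x6. eucluster (1 = EU member, 0 = not) is from the post.

* Cluster identifier: one shared id for all EU members, a unique id otherwise.
generate long clid = cond(eucluster == 1, 0, _n)

* (a) OLS with heteroskedasticity-robust SEs, no EU term
regress y x1-x6, vce(robust)

* (b) OLS with cluster-robust SEs: 24 singleton clusters plus one EU cluster
regress y x1-x6, vce(cluster clid)

* (c) Parametric alternative: EU dummy in the estimating equation
*     (classical SEs; add vce(robust) if heteroskedasticity is a concern)
regress y x1-x6 eucluster
```

Comparing the standard errors from (a), (b) and (c) shows how much the conclusions depend on the treatment of the EU group. With 25 very unbalanced clusters, the results from (b) should be read with the caveats given above.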
>> -----Original Message-----
>> From:
>> [] On Behalf Of
>> Sent: 28 November 2011 11:24
>> To:
>> Subject: st: Small sample with clustered data
>> Dear Statalist,
>> My sample consists of 50 countries with 26 of them being EU
>> Member States.
>> The problem is that the values of the dependent variable for
>> the EU members are not independent of each other. Thus, I
>> created a dummy variable "eucluster" that indicates if a
>> country is in the EU (1=yes; 0=no) and used the
>> cluster(eucluster) option after the OLS Regressions in Stata
>> 10. However, in "Clustered Errors in Stata"
>> (Nichols/Schaffer 2007 -
>> it is mentioned that if M, the number of clusters, is small
>> matters could even get worse by using the cluster option (Sheet 20).
>> M=50 seems to be the minimum number of clusters required.
>> I have 24 clusters consisting of 1 country and 1 cluster
>> comprising 26 EU members (6 independent variables).
>> I do not know how to deal "correctly" with these clustered
>> data in Stata. Hence, I would highly appreciate it if someone
>> could give me advice or suggest a solution on how to deal
>> with the clustered data in such a small sample.
>> Thanks for your consideration.
>> Lars
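
For reference, the cluster structure described in the question (24 single-country clusters plus one cluster of 26 EU members) enters the cluster-robust VCE as written below. The notation is an assumption chosen to match the x_i*u_i terms used in the reply: x_i is the K x 1 vector of regressors for country i (including the constant), \hat{u}_i is the OLS residual, M is the number of clusters, and N is the number of observations. The factor c is the finite-sample adjustment that Stata's regress applies with clustered standard errors.

```latex
\[
\widehat{V}_{\text{cluster}}
  = c\,(X'X)^{-1}\Bigl(\sum_{g=1}^{M} s_g s_g'\Bigr)(X'X)^{-1},
\qquad
s_g = \sum_{i \in g} x_i \hat{u}_i,
\qquad
c = \frac{M}{M-1}\cdot\frac{N-1}{N-K}.
\]

% A single-country cluster contributes the usual robust (White) term:
\[
s_g s_g' = \hat{u}_i^{2}\, x_i x_i' \quad \text{for } g = \{i\}.
\]

% The EU cluster contributes one outer product of the summed scores:
\[
s_{EU} s_{EU}' =
\Bigl(\sum_{i \in EU} x_i \hat{u}_i\Bigr)
\Bigl(\sum_{i \in EU} x_i \hat{u}_i\Bigr)'.
\]
```

The last line is the "aggregated super-observation" mentioned in the reply: the 26 EU members enter the middle matrix as a single summed term. With N = 50, M = 25 and K = 7 (six regressors plus a constant), c = (25/24)(49/43), or about 1.19.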
