
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: RE: Small sample with clustered data

From	"Schaffer, Mark E" <[email protected]>
To	<[email protected]>
Subject	RE: st: RE: Small sample with clustered data
Date	Wed, 30 Nov 2011 13:19:45 -0000

Lars,

I suppose my point is that there are different kinds of differences between subsamples, with different consequences, and because of the limited sample size you have scope for addressing some of these issues but not others.

My guess is that the most you can reasonably do is have a small number of country group dummies to capture differences in the mean across groups.  It's possible that just this is enough to address your concerns about clustering and SEs etc.

--Mark
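
A minimal Stata sketch of the group-dummy idea above, run on simulated data. The variable names (group, x, y), the three-way grouping, and all coefficient values are assumptions for illustration and are not taken from Lars's data. The i. factor-variable notation needs Stata 11 or later; in Stata 10 the xi: prefix would play the same role.

```stata
* Simulated stand-in data; grouping and coefficients are hypothetical.
clear
set seed 2011
set obs 50

* Three illustrative country groups: 1 = EU, 2 = other developed, 3 = developing
generate byte group = cond(_n <= 26, 1, cond(_n <= 38, 2, 3))
generate x = rnormal()
generate y = 1 + 0.5*x + 1.5*(group == 2) + 3*(group == 3) + rnormal()

* Group dummies shift the intercept; the slope on x is common to all groups
regress y x i.group, vce(robust)

* Joint test that the group intercepts are equal
testparm i.group
```

Each extra group costs one degree of freedom, which is why only a small number of groups is feasible with 50 observations.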

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of 
> [email protected]
> Sent: 30 November 2011 11:48
> To: [email protected]
> Subject: Re: st: RE: Small sample with clustered data
> 
> Thanks to you both for your help.
> 
> @Mark
> 
> I am not that happy with the 24 country subsample either 
> because it consists of developed and developing countries for 
> which calculations of the dependent variable are not 
> identical. When I use a dummy variable, doesn't that "control" 
> for possible differences between the two subsamples? I assume 
> the alternative is doing a Chow test? Splitting the sample 
> would be really problematic because of the seven independent 
> variables and limited degrees of freedom.
> 
> 
> Best
> 
> Lars
> 
> -----Original Message-----
> From: "Schaffer, Mark E" <[email protected]>
> Sent: 28.11.2011 22:40:58
> To: [email protected]
> Subject: st: RE: Small sample with clustered data
> 
> >Lars,
> >
> >A few thoughts...
> >
> >You say that the values of the dep var for the EU members are
> >correlated. But that's not necessarily a problem. What matters for the
> >VCE is the correlation of the error term u_i with u_j, or more
> >precisely, the correlation of x_i*u_i with x_j*u_j, where x_i is a
> >regressor.
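
The point about x_i*u_i and x_j*u_j can be read off the standard sandwich form of the OLS variance. The notation is an assumption added here: x_i is taken to be the column vector of regressors for observation i, and X stacks the x_i' as rows.

```latex
\[
\hat\beta - \beta = (X'X)^{-1} \sum_{i} x_i u_i ,
\qquad
\operatorname{Var}(\hat\beta \mid X)
  = (X'X)^{-1}
    \Big( \sum_{i} \sum_{j} x_i x_j' \, \operatorname{E}[u_i u_j \mid X] \Big)
    (X'X)^{-1} .
\]
```

The classical and heteroskedasticity-robust VCEs keep only the i = j terms of the double sum, so they are valid only when the cross terms (i not equal to j) are zero. Correlation in the dependent variable alone does not make them nonzero.
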
> >
> >Say, for example, that the true DGP has an EU fixed effect, i.e., an EU
> >dummy belongs in the estimating equation. If you estimate without an EU
> >dummy, the dependence in the errors can mess up the classical or robust
> >VCE. The cluster-robust VCE would deal with this by, in effect,
> >creating an aggregated super-observation for the EU in the calculation
> >of the VCE. But a much simpler way of dealing with the problem is just
> >to include an EU dummy. It's just like estimating a panel data model
> >when you expect the observations for a panel unit (country, household,
> >whatever) to have errors that are correlated via a fixed effect. You
> >could use OLS and cluster-robust SEs, but using the LSDV estimator is
> >better, and might on its own be a perfectly satisfactory solution.
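
The two routes described above can be put side by side in a short Stata sketch. Everything here is simulated and hypothetical: the data-generating process, the variable names, and the construction of the cluster identifier (one shared cluster for the EU plus 24 singleton clusters, giving the 25 clusters mentioned in the thread). The vce(robust) option in the second regression is a choice made for this sketch, not something the thread prescribes.

```stata
* Simulated stand-in for the thread's setup: 50 countries, 26 in the EU.
* All names and parameter values are hypothetical.
clear
set seed 12345
set obs 50
generate country = _n
generate byte eu = (_n <= 26)
generate x = rnormal()

* DGP with an EU fixed effect
generate y = 1 + 0.5*x + 2*eu + rnormal()

* Cluster id: one shared cluster for the EU, singleton clusters otherwise
generate eucluster = cond(eu, 0, country)

* (a) Omit the EU dummy and rely on the cluster-robust VCE (25 clusters)
regress y x, vce(cluster eucluster)

* (b) Model the EU effect directly with a dummy (LSDV-style)
regress y x eu, vce(robust)
```

In (a) the EU effect is left in the error term, so the EU errors share a common component; in (b) the dummy absorbs it.
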
> >
> >A related thought: you have 24 non-EU countries and 26 EU countries.
> >You seem happy with the 24 non-EU sample, and presumably if you were to
> >estimate using just these 24, the only thing that would bother you
> >would be the small-ish sample size. How do you feel about estimating
> >using a sample of just the 26 EU countries? If you feel OK about that
> >as well, then perhaps your main concern should be about whether
> >imposing the common coefficients assumption for the combined sample of
> >50 is warranted.
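
One way to probe the common coefficients assumption without splitting the sample is a Chow-type test on interaction terms. The sketch below uses simulated data with two hypothetical regressors (x1, x2); the names and parameter values are assumptions. The factor-variable syntax needs Stata 11 or later.

```stata
* Simulated stand-in data; names and parameter values are hypothetical.
clear
set seed 4242
set obs 50
generate byte eu = (_n <= 26)
generate x1 = rnormal()
generate x2 = rnormal()
generate y = 1 + 0.5*x1 - 0.3*x2 + 2*eu + rnormal()

* Fully interacted model: separate intercept and slopes for EU and non-EU
regress y i.eu##c.x1 i.eu##c.x2

* Chow-type test: are the slopes equal across the two subsamples?
testparm i.eu#c.x1 i.eu#c.x2
```

Every regressor adds one interaction term, so with six or seven regressors and 50 observations the fully interacted model leaves few degrees of freedom and the test may have little power.
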
> >
> >As for the number of clusters issue, you have two problems. First, 25
> >clusters isn't very many. The cluster-robust VCE gets its asymptotic
> >properties via the number of clusters going off to infinity, and 25
> >isn't very far on the way. Second, Austin Nichols has done some work
> >(I think cited in the 2007 presentation you mention) that shows that
> >the cluster-robust VCE doesn't work well with very unbalanced panels.
> >Knowing only what you've told us about your problem, I'd be reluctant
> >to recommend the cluster-robust VCE as the answer. Dealing with the
> >problem parametrically (e.g., with an EU dummy) seems like a better way
> >to go.
> >
> >HTH,
> >Mark
> >
> >> -----Original Message-----
> >> From: [email protected]
> >> [mailto:[email protected]] On Behalf Of 
> >> [email protected]
> >> Sent: 28 November 2011 11:24
> >> To: [email protected]
> >> Subject: st: Small sample with clustered data
> >>
> >> Dear Statalist,
> >>
> >> My sample consists of 50 countries with 26 of them being EU Member
> >> States.
> >> The problem is that the values of the dependent variable for the EU
> >> members are not independent of each other. Thus, I created a dummy
> >> variable "eucluster" that indicates if a country is in the EU (1=yes;
> >> 0=no) and used the cluster(eucluster) option after the OLS
> >> regressions in Stata 10. However, in "Clustered Errors in Stata"
> >> (Nichols/Schaffer 2007, http://repec.org/usug2007/crse.pdf)
> >> it is mentioned that if M, the number of clusters, is small, matters
> >> could even get worse by using the cluster option (Sheet 20).
> >> M=50 seems to be the minimum number of clusters required.
> >>
> >> I have 24 clusters consisting of 1 country and 1 cluster comprising
> >> 26 EU members (6 independent variables).
> >> I do not know how to deal "correctly" with these clustered data in
> >> Stata. Hence, I would highly appreciate it if someone could give me
> >> advice or suggest a solution on how to deal with the clustered data
> >> in such a small sample.
> >>
> >> Thanks for Consideration.
> >>
> >> Lars
> >> *
> >> * For searches and help try:
> >> * http://www.stata.com/help.cgi?search
> >> * http://www.stata.com/support/statalist/faq
> >> * http://www.ats.ucla.edu/stat/stata/
> >>
> >
> >
> >--
> >Heriot-Watt University is a Scottish charity registered 
> under charity 
> >number SC000278.
> >
> >Heriot-Watt University is the Sunday Times Scottish 
> University of the 
> >Year 2011-2012


