
From: "Schaffer, Mark E" <M.E.Schaffer@hw.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: When number of regressors greater than the number of clusters in OLS regression
Date: Mon, 1 Sep 2008 17:57:07 +0100

Divya,

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of
> Divya Balasubramaniam
> Sent: Monday, September 01, 2008 5:35 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: When number of regressors greater than the
> number of clusters in OLS regression
>
> Dear Dr. Schaffer,
>
> I am using clustering in my analysis and I am having some
> trouble understanding some of the important issues. I have
> read several papers you have written on clustering issues and
> hence I am emailing you to seek help.
>
> I am doing a district level analysis for the census year
> 2001. I have 436 districts in total coming from 17 States. I
> run an OLS regression of Share of households having tap water
> access on several control variables (I have about 25
> Regressors). I use the STATA command areg Y on X,
> absorb(State) cluster(state). I have the state fixed effects
> and clustered by State.
>
> My question is: I have more regressors (25) than the number of
> clusters (17). I also find in the STATA output that I have
> F-stat missing. I would like to seek your advice on whether I
> can make inference by looking at the individual coefficient
> estimates and the reported robust Standard errors. I did see
> your comment on this issue on the STATA listserv. However, I
> could not find answers as to how to fix this problem of
> having more regressors than the number of clusters.

I have done a bit of work on this with Austin Nichols. Austin's presentation at the 2007 UK Stata User Group meeting is available here:

http://www.stata.com/meeting/13uk/abstracts.html

Your question comes up on Statalist from time to time, e.g.,

http://www.stata.com/statalist/archive/2006-09/msg00840.html

Vince Wiggins' posting to Statalist is the most informative one I can think of:

http://www.stata.com/statalist/archive/2005-10/msg00594.html

The short answer, as I understand it, is that having #regressors > #clusters is not in itself a problem. The problems are, instead:

1. The cluster-robust VCV is consistent as the number of clusters goes to infinity. You'd like a large number of clusters so that you can be confident that the asymptotics are kicking in. 17 clusters is not very far on the way to infinity, so the performance of the cluster-robust VCV in your application could be poor.

2. The rank of the cluster-robust VCV is at most the number of clusters. This means you can't jointly test more hypotheses than you have clusters, which is why the F-stat for all 25 regressors is reported as missing. More generally, testing multiple hypotheses is going to eat up degrees of freedom, and you have very little to spare here (only 17 to start with).

Others on the list may also want to comment on this.

Cheers,
Mark

NB: General comment to Statalisters - I couldn't find a Stata FAQ on this. Did I miss it? If not, should there be one?

> I will be extremely thankful if you can kindly help me in this regard.
> Sincerely,
> Divya.
> =======================================
> Divya Balasubramaniam
> Economics PhD Student
> Terry College of Business
> University of Georgia
> Athens -30602.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>

--
Heriot-Watt University is a Scottish charity registered under charity number SC000278.
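The rank point in the reply above can be checked numerically. What follows is only a sketch, not Stata's implementation: it builds the "meat" of a cluster-robust sandwich estimator, the sum over clusters of s_g s_g' with s_g = X_g' u_g, and counts its rank. The function names, the simulated normal data, and the use of random draws as stand-in residuals are all assumptions made for illustration. The dimensions (436 observations, 25 regressors, 17 clusters) are taken from the question.

```python
import random


def matrix_rank(matrix, rel_tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    rows = [row[:] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    # Scale the zero threshold by the largest entry; fall back to 1.0 for a zero matrix.
    scale = max(abs(v) for row in rows for v in row) or 1.0
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) <= rel_tol * scale:
            continue  # no usable pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n_rows):
            factor = rows[r][col] / rows[rank][col]
            for c in range(col, n_cols):
                rows[r][c] -= factor * rows[rank][c]
        rank += 1
    return rank


def cluster_meat_rank(n_obs, n_regressors, n_clusters, seed=0):
    """Rank of sum_g s_g s_g', where s_g = X_g' u_g is the score of cluster g.

    X and u are simulated standard normal draws (an assumption for
    illustration; u is a stand-in for residuals, not an OLS fit).
    """
    rng = random.Random(seed)
    scores = [[0.0] * n_regressors for _ in range(n_clusters)]
    for i in range(n_obs):
        g = i % n_clusters          # assign observations to clusters in rotation
        u = rng.gauss(0.0, 1.0)     # stand-in residual for observation i
        for j in range(n_regressors):
            scores[g][j] += rng.gauss(0.0, 1.0) * u
    # Each cluster contributes one rank-one matrix, so the sum has rank <= n_clusters.
    meat = [
        [sum(s[a] * s[b] for s in scores) for b in range(n_regressors)]
        for a in range(n_regressors)
    ]
    return matrix_rank(meat)


if __name__ == "__main__":
    print(cluster_meat_rank(n_obs=436, n_regressors=25, n_clusters=17))
```

Because the meat is a sum of one rank-one matrix per cluster, a 25 x 25 matrix built from 17 clusters cannot have full rank, so a joint test of all 25 coefficients is not available. With actual OLS residuals the cluster scores sum to zero (X'u = 0), so the rank can be one lower still than in this sketch.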

References: st: When number of regressors greater than the number of clusters in OLS regression (From: Divya Balasubramaniam <divya@uga.edu>)

