# Re: st: When number of regressors greater than the number of clusters in OLS regression

From: Steven Samuels
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: When number of regressors greater than the number of clusters in OLS regression
Date: Mon, 1 Sep 2008 16:13:40 -0400

Divya,
I reread your question and realize that you probably do not have sample data at all. The Census of India was not a sample but, ideally, a 100% enumeration. (As in other countries, this will not be perfectly true.) So I am not sure that you should be clustering on state, or even on district, for that matter. Please reply with details about your observations. For example, do you have information on individual households or just district totals?

Regards,

Steven

On Sep 1, 2008, at 1:05 PM, Steven Samuels wrote:

More basic questions, Divya: What is your target population: the 17 states (of India, perhaps?) or the entire country? Were the 17 states selected from all states by a sampling process? Or were they chosen in some other way, for example, because they had data available? Are all districts from the selected states in your sample?

-Steven
On Sep 1, 2008, at 12:35 PM, Divya Balasubramaniam wrote:

Dear Dr. Schaffer,

I am using clustering in my analysis and I am having some trouble understanding some of the important issues. I have read several papers you have written on clustering issues and hence I am emailing you to seek help.

I am doing a district-level analysis for the census year 2001. I have 436 districts in total, drawn from 17 states. I run an OLS regression of the share of households with tap water access on several control variables (I have about 25 regressors). I use the Stata command areg Y X, absorb(state) cluster(state), so I have state fixed effects and standard errors clustered by state.

My question is: I have more regressors (25) than clusters (17). I also find that the F statistic is missing in the Stata output. I would like to seek your advice on whether I can make inferences from the individual coefficient estimates and the reported robust standard errors. I did see your comment on this issue on the Stata listserv. However, I could not find an answer on how to fix the problem of having more regressors than clusters.

I will be extremely thankful if you can kindly help me in this regard.
Sincerely,
Divya.
=======================================
Divya Balasubramaniam
Economics PhD Student
Terry College of Business
University of Georgia
Athens -30602.
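
The missing F statistic described in the question above is consistent with a standard rank argument, sketched here. The notation is an assumption for illustration and does not come from the thread: $G$ is the number of clusters, $k$ the number of non-absorbed regressors, $X_g$ the (state-demeaned) regressor rows for cluster $g$, and $\hat e_g$ the corresponding OLS residuals. Finite-sample scaling factors are omitted.

$$
\hat V = (X'X)^{-1}\Big(\sum_{g=1}^{G} s_g s_g'\Big)(X'X)^{-1},
\qquad s_g = X_g'\hat e_g ,
$$

$$
\sum_{g=1}^{G} s_g = X'\hat e = 0
\;\;\Longrightarrow\;\;
\operatorname{rank}(\hat V) \le G - 1 = 16 < k = 25 .
$$

Under these assumptions $\hat V$ is singular, so the joint Wald test of all 25 coefficients, which requires $\hat V^{-1}$, cannot be computed. That would explain why the overall F statistic is reported as missing. The diagonal elements of $\hat V$, and so the individual standard errors, can still be computed. How reliable they are with only 17 clusters is a separate question.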
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/