
Re: st: When number of regressors greater than the number of clusters in OLS regression

From   Divya Balasubramaniam <[email protected]>
To   [email protected]
Subject   Re: st: When number of regressors greater than the number of clusters in OLS regression
Date   Mon, 1 Sep 2008 21:08:15 -0400 (EDT)

Thank you all for your invaluable suggestions. I really appreciate it.


---- Original message ----
>Date: Mon, 1 Sep 2008 19:19:48 -0400
>From: Steven Samuels <[email protected]>  
>Subject: Re: st: When number of regressors greater than the number of clusters in OLS regression  
>To: [email protected]
>Thanks Mark. I've been thinking that the data were not *sampled* as  
>clusters. Since they were not, I erroneously assumed that there would  
>not be cluster effects. I agree clustered effects should be  
>considered. As Vince Wiggins stated in 
>archive/2005-10/msg00594.html , "We can use the [robust] covariance  
>matrix to test any subset of joint hypotheses that does not exceed  
>its rank." Thus Divya can get valid standard errors for single  
>coefficients, if she adds states as clusters, and can probably make  
>most of the inferences she is interested in.
>-xtreg- offers some intriguing possibilities, for it would  
>distinguish between state-level and district-level predictors of the  
>same kind. Of course statistics from neighboring districts may be  
>spatially correlated, opening up a completely different area of  
>Perhaps the best advice to Divya that I can give, in addition to Mark's:
>Clarify your purpose--is the study exploratory ("find a good  
>predictive model")? Or are you testing hypotheses about certain  
>predictors? If your analysis is exploratory, consider holding out a  
>random set of districts or states on which to test the fit of your  
>"best" models. If you are interested in certain predictors, then  
>others are potential effect modifiers and confounders. You probably  
>don't need them all. Do you have 25 predictors because you know they  
>are all important from other studies?  The more unnecessary  
>predictors you have in one model, the more difficult it will be to  
>tease out the truly important ones.
>On Sep 1, 2008, at 6:00 PM, Schaffer, Mark E wrote:
>> Whether or not you need to use cluster-robust depends on whether you
>> think your data have a problem that cluster-robust can address, namely
>> (1) the error terms in your equation are correlated within states
>> because of unobserved heterogeneity (so the iid assumption fails), but
>> (2) the error terms are not correlated across states.
>> A good example would be whether you are looking at something that is
>> affected by state-level regulation, i.e., the laws regulating it vary
>> from state to state, but you don't have variables that control for  
>> this
>> somehow.
Divya Balasubramaniam
Economics PhD Student
Terry College of Business
University of Georgia
Athens -30602.
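
The rank remark quoted above from Vince Wiggins can be illustrated numerically. What follows is a minimal sketch, not anything posted in this thread. It assumes the middle ("meat") term of the cluster-robust covariance matrix is the sum over clusters of the outer products of per-cluster score vectors s_g = X_g'u_g, so with G clusters and k regressors its rank cannot exceed G. The scores below are made-up integers rather than real residual-based scores, and the names cluster_meat and matrix_rank are hypothetical helpers. With actual OLS residuals the scores sum to zero, so the rank is at most G - 1.

```python
from fractions import Fraction


def outer(v):
    """Outer product v v' of a vector with itself."""
    return [[a * b for b in v] for a in v]


def add(A, B):
    """Elementwise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def cluster_meat(scores):
    """Sum of outer products of the per-cluster score vectors."""
    k = len(scores[0])
    meat = [[0] * k for _ in range(k)]
    for s in scores:
        meat = add(meat, outer(s))
    return meat


def matrix_rank(M):
    """Exact rank via Gaussian elimination on fractions."""
    rows = [[Fraction(x) for x in row] for row in M]
    rank = 0
    for col in range(len(rows[0])):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next(
            (r for r in range(rank, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


K = 5  # number of regressors (illustrative)
G = 3  # number of clusters (illustrative)

# Made-up, linearly independent score vectors: s_g = (1, g, g^2, ..., g^(K-1)).
scores = [[g ** j for j in range(K)] for g in range(1, G + 1)]

meat = cluster_meat(scores)
print(matrix_rank(meat))  # → 3: a 5 x 5 matrix, but only rank G
```

In this toy setup a joint test of all five coefficients would need a rank-5 matrix and is not available, while tests involving no more restrictions than the rank remain possible, which matches the quoted statement.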