
From: "Austin Nichols" <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: cluster and F test
Date: Sun, 6 Jul 2008 10:05:23 -0400

sara borelli <saraborelli77@yahoo.it>:

An individual SE may be OK, in the sense that a test involving only one coef may have approximately the right size, but e(V) has rank M-1, so the upper limit on the number of coefs that can be included in one joint test is M-1. The reported SEs ignore the cov between the 37 estimates; they offer a test of one coef each, ignoring the fact that you can't actually test all 37, or even 14, jointly. But in this case, a test of even one coef is suspect, because you have M-1=13, which is a very small number to treat as close to infinity. 50 clusters, or at the very least 20 large balanced clusters, are needed to be reasonably sure the size distortion is not too large.

In general, it is probably a bad idea to include more variables than you have effective df, though for the CRSE, Stata will let you do it, for various reasons. For example, if you had 50 clusters, 50 fixed effects for cluster and 120 fixed effects for time, you could include these 170 effects as 168 dummy variables along with one explanatory variable of interest. You can never test the joint sig of the cluster FE nor the joint sig of the time FE, and you will (one hopes) not be testing smaller groups of these FE either, so the only test you plan to do in this case is on the one explanatory variable of interest, with 49 df. In this case, you should be fine. Note the relevant number is M-k, number of clusters less number of constraints.

--Austin

On Sun, Jul 6, 2008 at 5:14 AM, sara borelli <saraborelli77@yahoo.it> wrote:
> Hi [Austin],
> thank you very much for your help.
>
> When I test (with the F-test) 18 restrictions with 14 clusters, Stata drops 5 constraints because, as you said, it can test only 13 constraints.
> There is something I do not understand, however. With the cluster option the number of observations useful to estimate the standard errors becomes the number of clusters, 14. Thus, if I have 37 standard errors to estimate and only 14 clusters, how is it possible that Stata is able to estimate all the standard errors, but still test only 13 constraints?
> Basically, when the number of clusters is smaller than the number of regressors, is only the F-test computed in a wrong way, or also the standard errors?
> I am sorry to keep asking about this, but I am a bit confused.
> Thank you
> Sara
>
> --- Sat 5/7/08, Austin Nichols <austinnichols@gmail.com> wrote:
>
>> From: Austin Nichols <austinnichols@gmail.com>
>> Subject: Re: st: cluster and F test
>> To: statalist@hsphsun2.harvard.edu
>> Date: Saturday, 5 July 2008, 19:01
>>
>> sara borelli <saraborelli77@yahoo.it>:
>> The cluster-robust standard error (CRSE) estimator has at most M-1 df with M clusters, so with 14 clusters you can test the joint sig. of at most 13 coefs. But the performance of the estimator gets worse as you increase the number of constraints. The CRSE's performance improves as M-k increases toward infinity, where M is the number of clusters and k the number of constraints you are testing, and for M-k at least 20 and clusters balanced you should expect good performance. Since you have M-k equal to one (the minimum possible value), you should expect that the estimated variance is too low and the F stat is too high, on average. Note that clusters are like super-observations, for the purposes of the SE of estimated coefs, so a regression on 37 variables with 14 clusters is a bit like a regression on 37 vars with 14 obs--you really don't want to test more than one coef there, and maybe not even that many. How are your clusters defined? Is there any possibility of adding more clusters, or redefining them sensibly so you have more clusters?
>>
>> On Fri, Jul 4, 2008 at 5:16 AM, sara borelli <saraborelli77@yahoo.it> wrote:
>> > Dear Stata List members,
>> >
>> > I have found some related questions in the FAQs, but I cannot find exactly what I need.
>> > I am running a regression with the cluster option. I have 37 independent variables (including the constant), 1647 observations, and 14 clusters.
>> > I want to test the joint significance of 18 variables.
>> > If I do NOT use the cluster option the F is calculated correctly as F(18, 1637).
>> > But once I introduce the cluster option I get the following result:
>> >  (1) x1 = 0
>> >  (2) x2 = 0
>> >  (3) x3 = 0
>> >  (4) x3 = 0
>> >  ...
>> >  (18) x18 = 0
>> >  Constraint 1 dropped
>> >  Constraint 2 dropped
>> >  Constraint 3 dropped
>> >  Constraint 4 dropped
>> >  Constraint 14 dropped
>> >
>> >  F( 13, 13) = 109.42
>> >  Prob > F = 0.0000
>> >
>> > I guess Stata is doing something with the degrees of freedom, but it is not clear to me what is going on or why it is dropping the constraints. Is the final F test calculated correctly?
>> > Thank you in advance for any help

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
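
The rank argument in the reply above (e(V) has rank M-1, so at most M-1 coefs can be tested jointly) can be checked on toy data. What follows is a minimal sketch in Python rather than Stata, using exact fractions and only the standard library. The data, the helper names (`solve`, `rank`, `max_testable`) and the choice of 4 clusters with 6 regressors are all hypothetical and only for illustration. It relies on the fact that OLS residuals satisfy X'e = 0, so the M cluster score vectors sum to zero and the "meat" of the sandwich estimator has rank at most M-1. Any scalar finite-sample correction is omitted because it does not change the rank.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # first usable pivot row
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [u - f * v for u, v in zip(aug[r], aug[c])]
    return [aug[i][n] for i in range(n)]

def rank(A):
    """Rank of a matrix by exact row reduction."""
    m = [row[:] for row in A]
    rk = 0
    for c in range(len(m[0])):
        p = next((r for r in range(rk, len(m)) if m[r][c] != 0), None)
        if p is None:
            continue
        m[rk], m[p] = m[p], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][c] / m[rk][c]
            m[r] = [u - f * v for u, v in zip(m[r], m[rk])]
        rk += 1
    return rk

def max_testable(q, n_clusters):
    """Largest number of constraints a joint test can keep: min(q, M-1)."""
    return min(q, n_clusters - 1)

# Hypothetical toy data: N = 20 obs, K = 6 regressors, M = 4 clusters of 5 obs.
K, M, N = 6, 4, 20
X = [[Fraction(t ** j) for j in range(K)] for t in range(N)]  # columns 1, t, ..., t^5
y = [Fraction((7 * t * t + 3 * t) % 11) for t in range(N)]    # arbitrary outcome
cluster = [t // 5 for t in range(N)]

# OLS via the normal equations (X'X) beta = X'y
XtX = [[sum(X[i][a] * X[i][b] for i in range(N)) for b in range(K)] for a in range(K)]
Xty = [sum(X[i][a] * y[i] for i in range(N)) for a in range(K)]
beta = solve(XtX, Xty)
resid = [y[i] - sum(X[i][a] * beta[a] for a in range(K)) for i in range(N)]

# Cluster scores s_g = sum over obs i in cluster g of x_i * e_i
scores = [[sum(X[i][a] * resid[i] for i in range(N) if cluster[i] == g)
           for a in range(K)] for g in range(M)]
score_sum = [sum(scores[g][a] for g in range(M)) for a in range(K)]  # equals X'e

# Meat of the sandwich: sum over clusters of s_g s_g'
meat = [[sum(scores[g][a] * scores[g][b] for g in range(M)) for b in range(K)]
        for a in range(K)]
meat_rank = rank(meat)

print("cluster scores sum to zero:", all(v == 0 for v in score_sum))
print("rank of meat:", meat_rank, " upper bound M-1:", M - 1)
print("constraints kept when testing 18 with 14 clusters:", max_testable(18, 14))
```

Because (X'X)^-1 is invertible, the full sandwich (X'X)^-1 * meat * (X'X)^-1 has the same rank as the meat. This is why every coefficient can still get a standard error from the diagonal while a joint test of more than M-1 coefficients is impossible, and why 5 of the 18 constraints are dropped with 14 clusters.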

