
From: "Ángel Rodríguez Laso" <angelrlaso@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: cluster and F test

Date: Tue, 8 Jul 2008 11:07:53 +0200

Following the discussion, I don't fully understand how the degrees of freedom (number of clusters minus number of strata) and the actual number of observations are used in svy commands (which are related to cluster regression). I ask because when I calculate the sample size needed in a survey to estimate a proportion at a given confidence level, the number I get is a number of observations, not a number of degrees of freedom. So I assume that the number of observations is what determines the standard error, and then I don't know what the degrees of freedom are used for.

Cheers,

Ángel Rodríguez

2008/7/7, sara borelli <saraborelli77@yahoo.it>:
> Austin,
>
> thank you very much for your help,
> Sara
>
> --- Sun 6/7/08, Austin Nichols <austinnichols@gmail.com> wrote:
>
> > From: Austin Nichols <austinnichols@gmail.com>
> > Subject: Re: st: cluster and F test
> > To: statalist@hsphsun2.harvard.edu
> > Date: Sunday 6 July 2008, 16:05
> >
> > sara borelli <saraborelli77@yahoo.it> :
> > An individual SE may be OK, in the sense that a test involving only
> > one coef may have approximately the right size, but e(V) has rank M-1
> > and so the upper limit on the number of coefs that can be included in
> > one joint test is M-1. The reported SEs ignore the cov between the 37
> > estimates; they offer a test of one coef each, ignoring the fact that
> > you can't actually test all 37, or even 14, jointly. But in this
> > case, a test of even one coef is suspect, because you have M-1=13
> > which is a very small number to consider close to infinity. 50
> > clusters, or at the very least 20 large balanced clusters, are needed
> > to be reasonably sure the size distortion is not too large. In
> > general, it probably seems like a bad idea to include more variables
> > than you have effective df, though for the CRSE, Stata will let you do
> > it, for various reasons. For example, if you had 50 clusters, 50
> > fixed effects for cluster and 120 fixed effects for time, you could
> > include these 170 effects as 168 dummy variables along with one
> > explanatory variable of interest. You can never test the joint sig of
> > the cluster FE nor the joint sig of the time FE, and you will (one
> > hopes) not be testing smaller groups of these FE either, so the only
> > test you plan to do in this case is on the one explanatory variable of
> > interest, with 49 df. In this case, you should be fine. Note the
> > relevant number is M-k, number of clusters less number of constraints.
> >
> > --Austin
> >
> > On Sun, Jul 6, 2008 at 5:14 AM, sara borelli <saraborelli77@yahoo.it> wrote:
> > > Hi [Austin],
> > > thank you very much for your help.
> > >
> > > When I test (with the F-test) 18 restrictions with 14 clusters, Stata drops 5 constraints because, as you said, it can test only 13 constraints.
> > > There is something I do not understand, however. With the cluster option the number of observations useful to estimate the standard errors becomes the number of clusters, 14. Thus, if I have 37 standard errors to estimate and only 14 clusters, how is it possible that Stata is able to estimate all the standard errors, but still test only 13 constraints?
> > > Basically, when the number of clusters is smaller than the number of regressors, is only the F-test computed in a wrong way, or also the standard errors?
> > > I am sorry to keep asking about this, but I am a bit confused.
> > > Thank you
> > > Sara
> > >
> > > --- Sat 5/7/08, Austin Nichols <austinnichols@gmail.com> wrote:
> > >
> > >> From: Austin Nichols <austinnichols@gmail.com>
> > >> Subject: Re: st: cluster and F test
> > >> To: statalist@hsphsun2.harvard.edu
> > >> Date: Saturday 5 July 2008, 19:01
> > >>
> > >> sara borelli <saraborelli77@yahoo.it>:
> > >> The cluster-robust standard error (CRSE) estimator has at most M-1 df
> > >> with M clusters, so with 14 clusters you can test the joint sig. of at
> > >> most 13 coefs. But the performance of the estimator gets worse as you
> > >> increase the number of constraints. The CRSE's performance
> > >> improves as M-k increases toward infinity, where M is the number of
> > >> clusters and k the number of constraints you are testing, and for M-k
> > >> at least 20 and clusters balanced you should expect good performance.
> > >> Since you have M-k equal to one (the minimum possible value), you
> > >> should expect that the estimated variance is too low and the F stat is
> > >> too high, on average. Note that clusters are like super-observations,
> > >> for the purposes of the SE of estimated coefs, so a regression on 37
> > >> variables with 14 clusters is a bit like a regression on 37 vars with
> > >> 14 obs--you really don't want to test more than one coef there, and
> > >> maybe not even that many. How are your clusters defined? Is there
> > >> any possibility of adding more clusters, or redefining them sensibly
> > >> so you have more clusters?
> > >>
> > >> On Fri, Jul 4, 2008 at 5:16 AM, sara borelli <saraborelli77@yahoo.it> wrote:
> > >> > Dear Stata List members,
> > >> >
> > >> > I have found some related questions in the FAQs, but I cannot find exactly what I need.
> > >> > I am running a regression with the cluster option. I have 37 independent variables (including the constant), 1647 observations, and 14 clusters.
> > >> > I want to test the joint significance of 18 variables.
> > >> > If I do NOT use the cluster option the F is calculated correctly as F(18, 1637).
> > >> > But once I introduce the cluster option I get the following result:
> > >> > (1) x1 = 0
> > >> > (2) x2 = 0
> > >> > (3) x3 = 0
> > >> > (4) x3 = 0
> > >> > ...
> > >> > (18) x18 = 0
> > >> > Constraint 1 dropped
> > >> > Constraint 2 dropped
> > >> > Constraint 3 dropped
> > >> > Constraint 4 dropped
> > >> > Constraint 14 dropped
> > >> >
> > >> > F( 13, 13) = 109.42
> > >> > Prob > F = 0.0000
> > >> >
> > >> > I guess Stata is doing something with the degrees of freedom, but it is not clear to me what is going on or why it is dropping the constraints. Is the final F test calculated correctly?
> > >> > Thank you in advance for any help

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
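
The rank limit that Austin Nichols describes in the quoted thread can be checked on a toy example. The sketch below is not from the thread. It uses made-up data (4 clusters of 3 observations each, 6 regressors built as powers of an index so that X has full column rank) and exact rational arithmetic in plain Python. It assumes the usual sandwich form of the cluster-robust variance and ignores the finite-sample scaling factor, which does not change the rank. Because OLS residuals satisfy X'u = 0, the per-cluster score vectors sum to zero, so the middle matrix of the sandwich, and therefore the estimated variance, has rank at most M-1.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    aug = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * p for a, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def rank(rows):
    """Exact rank of a matrix (list of rows of Fractions) by row reduction."""
    rows = [r[:] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][col] != 0:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [a - f * p for a, p in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk

# Hypothetical toy data: M clusters, more coefficients (k) than M - 1.
M, per_cluster, k = 4, 3, 6
n = M * per_cluster
X = [[Fraction((i + 1) ** j) for j in range(k)] for i in range(n)]  # column 0 is the constant
y = [Fraction((7 * i * i + 3) % 11) for i in range(n)]              # arbitrary outcome values
cluster = [i // per_cluster for i in range(n)]

# OLS via the normal equations: (X'X) beta = X'y.
XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
beta = solve(XtX, Xty)
resid = [y[i] - sum(X[i][a] * beta[a] for a in range(k)) for i in range(n)]

# One score vector per cluster: s_g = X_g' u_g.
scores = [[sum(X[i][a] * resid[i] for i in range(n) if cluster[i] == g)
           for a in range(k)] for g in range(M)]

# The scores sum to X'u = 0, so at most M - 1 of them are linearly independent.
score_sum = [sum(s[a] for s in scores) for a in range(k)]

# rank(sum_g s_g s_g') equals the rank of the M x k score matrix, and
# pre- and post-multiplying by the nonsingular (X'X)^-1 leaves it unchanged.
meat_rank = rank(scores)
print("rank of cluster-robust variance:", meat_rank, "| bound M-1:", M - 1, "| coefficients:", k)
```

Applied to the numbers in the thread, the same argument gives a rank of at most 13 with 14 clusters and 37 coefficients, which is consistent with the quoted test output keeping only 13 of the 18 constraints.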

**Follow-Ups**:
- **Re: st: cluster and F test** *From:* Steven Samuels <sjhsamuels@earthlink.net>

**References**:
- **Re: st: cluster and F test** *From:* "Austin Nichols" <austinnichols@gmail.com>
- **Re: st: cluster and F test** *From:* sara borelli <saraborelli77@yahoo.it>
