



st: RE: RE: RE: -robvar- and number of degrees of freedom


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   st: RE: RE: RE: -robvar- and number of degrees of freedom
Date   Fri, 3 Sep 2010 12:37:02 +0100

Another take on this is that -robvar- is doing what you are asking for, but what is misleading is that -tabulate- (called by -robvar-) is not showing the missing categories on the by-variable. 

However, it is common practice elsewhere that missings on a by-variable are excluded by default (and, at least sometimes, that you would need to opt explicitly for missings to be included). 

I think -robvar- needs a fix either way. 
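The missing category that the table leaves out can be made visible directly. A minimal sketch, assuming the auto data used elsewhere in this thread; -tabulate- has a -missing- option that treats missing values as a category of their own:

```stata
* Show the category that the -robvar- table omits
sysuse auto, clear

* default: observations with missing rep78 are left out of the table
tabulate rep78

* with the missing option, missing values get their own row
tabulate rep78, missing

* count the observations with a missing by-variable
count if missing(rep78)
```

The last count should agree with the 5 observations reported as deleted further down in this thread.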

Nick 
[email protected] 

Nick Cox (2) 

An even simpler alternative, while we await StataCorp's comment, is to exclude the missings explicitly:

. robvar mpg if !missing(rep78), by(rep78) 

Nick Cox (1) 

Well spotted. 

My guess is that -robvar- is not marking out observations with missing values on the -by()- variable, so that the missing category is wrongly counted as an extra group. A simple experiment supports this: drop those observations first and the degrees of freedom come out as expected.

. drop if missing(rep78)
(5 observations deleted)

. robvar mpg, by(rep78)

     Repair |      Summary of Mileage (mpg)
Record 1978 |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          1 |          21   4.2426407           2
          2 |      19.125   3.7583241           8
          3 |   19.433333   4.1413252          30
          4 |   21.666667   4.9348699          18
          5 |   27.363636   8.7323849          11
------------+------------------------------------
      Total |   21.289855   5.8664085          69

W0  =  5.8525980   df(4, 64)     Pr > F = 0.00044531

W50 =  4.0610367   df(4, 64)     Pr > F = 0.00537416

W10 =  6.1590202   df(4, 64)     Pr > F = 0.00029485

I've not looked at all the code but my guess is that 

	marksample touse

should be followed by something like 

	markout `touse' `by', strok 

but naturally you should not make this change to the official -robvar-. At most, clone -robvar- under a new name and check whether this works in the clone.
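To illustrate what the -marksample- and -markout- pair does to the degrees of freedom, here is a toy sketch. It is not the official -robvar- code, which I have not reproduced; the program name toydf and its keepmissing option are hypothetical, made up for this illustration. It only counts groups k and observations N in the marked sample and displays df(k - 1, N - k).

```stata
* Toy program, NOT the official -robvar- code: it only counts groups and
* observations to show how -markout- changes the degrees of freedom.
* The name toydf and the keepmissing option are hypothetical.
capture program drop toydf
program define toydf
    version 11
    syntax varname(numeric) [if] [in], by(varname) [KEEPMissing]

    * marksample handles if/in and missing values of the analysis variable only
    marksample touse

    * unless keepmissing is specified, also exclude missing by() values;
    * strok lets -markout- handle a string by-variable
    if "`keepmissing'" == "" {
        markout `touse' `by', strok
    }

    * flag the first observation of each by-group within the sample
    * (note: -bysort- changes the sort order of the data)
    tempvar first
    bysort `touse' `by': generate byte `first' = (_n == 1) & `touse'

    quietly count if `first'
    local k = r(N)
    quietly count if `touse'
    local N = r(N)

    * degrees of freedom as (groups - 1, observations - groups)
    local df1 = `k' - 1
    local df2 = `N' - `k'
    display as text "groups = `k', N = `N', df(`df1', `df2')"
end

sysuse auto, clear
toydf mpg, by(rep78)
toydf mpg, by(rep78) keepmissing
```

Given the counts reported in this thread (69 observations with rep78 observed, 5 with it missing), the first call should show 5 groups and df(4, 64), and the second 6 groups and df(5, 68), which is the pattern in the two -robvar- listings here.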

Nick 
[email protected] 

Garry Anderson

The -robvar- command does not seem to be reporting the correct degrees
of freedom.

. webuse auto
(1978 Automobile Data)

. robvar mpg,by(rep78)

     Repair |      Summary of Mileage (mpg)
Record 1978 |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          1 |          21   4.2426407           2
          2 |      19.125   3.7583241           8
          3 |   19.433333   4.1413252          30
          4 |   21.666667   4.9348699          18
          5 |   27.363636   8.7323849          11
------------+------------------------------------
      Total |   21.289855   5.8664085          69

W0  =  4.7219575   df(5, 68)     Pr > F = 0.00092356

W50 =  3.2906157   df(5, 68)     Pr > F = 0.01014559

W10 =  4.9744717   df(5, 68)     Pr > F = 0.00061062


I would have expected the numerator df to be 4 rather than 5, because there are 5 groups.



