Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: RE: RE: -robvar- and number of degrees of freedom
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: RE: RE: RE: -robvar- and number of degrees of freedom
Date: Fri, 3 Sep 2010 12:37:02 +0100
Another take on this is that -robvar- is doing what you asked for, and that what is misleading is that -tabulate- (called by -robvar-) does not show the missing category of the by-variable. 
However, it is common practice elsewhere that missings on a by-variable are excluded by default (and, at least sometimes, that you need to opt in explicitly for missings to be included). 
I think -robvar- needs a fix either way. 
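To see the category that the table hides, one can ask -tabulate- for the missings directly. This is only a sketch: it assumes that -robvar- builds its table with something like -tabulate, summarize()-, which is an assumption and not confirmed from the code.

```stata
webuse auto, clear
* default: observations with missing rep78 are left out of the table
tabulate rep78, summarize(mpg)
* -missing- option: missing rep78 is shown as a category of its own
tabulate rep78, summarize(mpg) missing
```

With the auto data the second table should gain a row for the 5 observations with missing rep78.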
Nick 
[email protected] 
Nick Cox (2) 
An even simpler alternative, while we await StataCorp's comment, is to exclude the missings explicitly:
. robvar mpg if !missing(rep78), by(rep78) 
Nick Cox (1) 
Well spotted. 
My guess is that -robvar- is not marking out missings on the -by()- variable, so the missing values are wrongly counted as an extra group when the degrees of freedom are calculated. A simple experiment, dropping those observations first, gives the correct answer:
. drop if missing(rep78)
(5 observations deleted)
. robvar mpg, by(rep78)
     Repair |      Summary of Mileage (mpg)
Record 1978 |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          1 |          21   4.2426407           2
          2 |      19.125   3.7583241           8
          3 |   19.433333   4.1413252          30
          4 |   21.666667   4.9348699          18
          5 |   27.363636   8.7323849          11
------------+------------------------------------
      Total |   21.289855   5.8664085          69
W0  =  5.8525980   df(4, 64)     Pr > F = 0.00044531
W50 =  4.0610367   df(4, 64)     Pr > F = 0.00537416
W10 =  6.1590202   df(4, 64)     Pr > F = 0.00029485
I've not looked at all the code but my guess is that 
	marksample touse
should be followed by something like 
	markout `touse' `by', strok 
but naturally you should not make this change on the official -robvar-. At most, clone -robvar- and check whether this works. 
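As an illustration of what -marksample- followed by -markout- does, here is a toy program. It is not the -robvar- code: the name -countgroups- and everything inside it are hypothetical, written only to show the pattern on a by-variable.

```stata
* Hypothetical toy program, NOT the official -robvar- code
capture program drop countgroups
program define countgroups, rclass
    version 11
    syntax varname [if] [in], by(varname)
    * mark the sample: honours if/in and missings in the main variable
    marksample touse
    * also drop observations with a missing by-variable;
    * strok lets the by-variable be a string variable
    markout `touse' `by', strok
    * count groups and observations in the marked sample
    quietly tabulate `by' if `touse'
    display "groups = " r(r) "   N = " r(N)
    return scalar groups = r(r)
    return scalar N = r(N)
end

webuse auto, clear
countgroups mpg, by(rep78)
```

With the auto data this should report 5 groups and 69 observations, in line with the table above. Commenting out the -markout- line leaves the marked sample at all 74 observations, which is the situation guessed at for -robvar-.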
Nick 
[email protected] 
Garry Anderson
The -robvar- command does not seem to be reporting the correct degrees
of freedom.
. webuse auto
(1978 Automobile Data)
. robvar mpg,by(rep78)
     Repair |      Summary of Mileage (mpg)
Record 1978 |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          1 |          21   4.2426407           2
          2 |      19.125   3.7583241           8
          3 |   19.433333   4.1413252          30
          4 |   21.666667   4.9348699          18
          5 |   27.363636   8.7323849          11
------------+------------------------------------
      Total |   21.289855   5.8664085          69
W0  =  4.7219575   df(5, 68)     Pr > F = 0.00092356
W50 =  3.2906157   df(5, 68)     Pr > F = 0.01014559
W10 =  4.9744717   df(5, 68)     Pr > F = 0.00061062
I would have expected the numerator degrees of freedom to be 4 rather than 5, because there are 5 groups.
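The discrepancy can be reproduced by hand. Levene's W0 is referred to an F distribution with (k - 1, N - k) degrees of freedom, where k is the number of groups and N the number of observations. A rough check, using the results that -tabulate- leaves behind in r(r) and r(N):

```stata
webuse auto, clear
* groups and observations when missing rep78 is ignored
quietly tabulate rep78
local df1 = r(r) - 1
local df2 = r(N) - r(r)
display "missings excluded: df(`df1', `df2')"
* groups and observations when missing rep78 counts as a group
quietly tabulate rep78, missing
local df1 = r(r) - 1
local df2 = r(N) - r(r)
display "missings included: df(`df1', `df2')"
```

This should display df(4, 64) and df(5, 68) respectively, matching the two sets of -robvar- results in this thread.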