# st: RE: RE: RE: -robvar- and number of degrees of freedom

From: Nick Cox
To: "'statalist@hsphsun2.harvard.edu'"
Subject: st: RE: RE: RE: -robvar- and number of degrees of freedom
Date: Fri, 3 Sep 2010 12:37:02 +0100

```
Another take on this is that -robvar- is doing what you are asking for, but what is misleading is that -tabulate- (called by -robvar-) is not showing the missing categories on the by-variable.

However, it is common practice elsewhere that missings on a by-variable are excluded by default (and, at least sometimes, that you would need to opt explicitly for missings to be included).

I think -robvar- needs a fix either way.

Nick
n.j.cox@durham.ac.uk

Nick Cox (2)

An even simpler alternative, while we await StataCorp's comment, is to exclude the missings explicitly:

. robvar mpg if !missing(rep78), by(rep78)

Nick Cox (1)

Well spotted.

My guess is that -robvar- is not marking out missings on the -by()- variable, so that they are wrongly counted as an extra group. A simple experiment supports this: drop the missings first and the expected degrees of freedom appear.

. drop if missing(rep78)
(5 observations deleted)

. robvar mpg, by(rep78)

     Repair |      Summary of Mileage (mpg)
Record 1978 |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          1 |          21   4.2426407           2
          2 |      19.125   3.7583241           8
          3 |   19.433333   4.1413252          30
          4 |   21.666667   4.9348699          18
          5 |   27.363636   8.7323849          11
------------+------------------------------------
      Total |   21.289855   5.8664085          69

W0  =  5.8525980   df(4, 64)     Pr > F = 0.00044531

W50 =  4.0610367   df(4, 64)     Pr > F = 0.00537416

W10 =  6.1590202   df(4, 64)     Pr > F = 0.00029485

I've not looked at all the code but my guess is that

marksample touse

should be followed by something like

markout `touse' `by', strok

but naturally you should not make this change on the official -robvar-. At most, clone -robvar- and check whether this works.

Nick
n.j.cox@durham.ac.uk

Garry Anderson

The -robvar- command does not seem to be reporting the correct degrees
of freedom.

. webuse auto
(1978 Automobile Data)

. robvar mpg,by(rep78)

     Repair |      Summary of Mileage (mpg)
Record 1978 |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          1 |          21   4.2426407           2
          2 |      19.125   3.7583241           8
          3 |   19.433333   4.1413252          30
          4 |   21.666667   4.9348699          18
          5 |   27.363636   8.7323849          11
------------+------------------------------------
      Total |   21.289855   5.8664085          69

W0  =  4.7219575   df(5, 68)     Pr > F = 0.00092356

W50 =  3.2906157   df(5, 68)     Pr > F = 0.01014559

W10 =  4.9744717   df(5, 68)     Pr > F = 0.00061062

I would have expected the numerator degrees of freedom to be 4 rather than 5, because there are 5 groups.

```
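The degrees of freedom discussed in the thread can be checked by hand: with k groups and N observations, the tests should report df(k - 1, N - k). The following is a minimal sketch, not part of the original posts. It assumes the auto data shipped with Stata and relies on the results r(r) (number of categories) and r(N) (number of observations) that -tabulate- leaves behind; the local macro names are illustrative only.

```stata
* Sketch: derive the expected degrees of freedom from group counts.
* Assumes the auto data shipped with Stata; macro names are illustrative.
sysuse auto, clear

* Missing values of rep78 excluded (the default for -tabulate-)
quietly tabulate rep78
local df1 = r(r) - 1        // number of groups minus 1
local df2 = r(N) - r(r)     // observations minus number of groups
display "missings excluded: df(`df1', `df2')"

* Missing treated as one more group
quietly tabulate rep78, missing
local df1m = r(r) - 1
local df2m = r(N) - r(r)
display "missings included: df(`df1m', `df2m')"
```

With 5 groups and 69 observations the first line should show df(4, 64), matching the output obtained after dropping the missings. Counting missing as a sixth group among all 74 observations gives df(5, 68), which is what -robvar- originally reported.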