Statalist



st: bug in -anova-


From:    "Airey, David C" <david.airey@Vanderbilt.Edu>
To:      Statalist <statalist@hsphsun2.harvard.edu>
Subject: st: bug in -anova-
Date:    Sat, 12 Dec 2009 20:59:50 -0600

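I think I have found a bug in -anova- in Stata 11 related to its automatic handling of categorical variables. When I do not explicitly mark the variable with factor-variable notation (ibn.c) and leave -anova- to treat it as categorical on its own, it drops a level when noconstant is used: the model has 5 df instead of 6, and the coefficient table from -regress- omits level 1 of c.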
 
. anova htr pretreat##strain // normal result

                           Number of obs =      36     R-squared     =  0.8933
                           Root MSE      = 7.77827     Adj R-squared =  0.8755

                  Source |  Partial SS    df       MS           F     Prob > F
         ----------------+----------------------------------------------------
                   Model |  15189.5106     5  3037.90212      50.21     0.0000
                         |
                pretreat |   11977.163     2  5988.58148      98.98     0.0000
                  strain |  1953.15158     1  1953.15158      32.28     0.0000
         pretreat#strain |  1259.19604     2  629.598022      10.41     0.0004
                         |
                Residual |  1815.04618    30  60.5015394   
         ----------------+----------------------------------------------------
                   Total |  17004.5568    35  485.844479   

. 
. egen c = group(pretreat strain) // make variable for cell means model

. table pretreat strain, c(mean c) // show cell means grouping variable

------------------------------
          |       strain      
 pretreat | C57BL/6J    DBA/2J
----------+-------------------
 SB206553 |        1         2
 SB242084 |        3         4
   Saline |        5         6
------------------------------

. anova htr ibn.c, noconstant // run cell means model ***CORRECT with 6 df***

                           Number of obs =      36     R-squared     =  0.9624
                           Root MSE      = 7.77827     Adj R-squared =  0.9548

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  46410.4313     6  7735.07188     127.85     0.0000
                         |
                       c |  46410.4313     6  7735.07188     127.85     0.0000
                         |
                Residual |  1815.04618    30  60.5015394   
              -----------+----------------------------------------------------
                   Total |  48225.4775    36   1339.5966   

. anova htr c, noconstant // run cell means model ***WRONG with 5 df***

                           Number of obs =      36     R-squared     =  0.9496
                           Root MSE      = 8.85083     Adj R-squared =  0.9415

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  45797.0261     5  9159.40521     116.92     0.0000
                         |
                       c |  45797.0261     5  9159.40521     116.92     0.0000
                         |
                Residual |  2428.45141    31  78.3371423   
              -----------+----------------------------------------------------
                   Total |  48225.4775    36   1339.5966   

. regress

      Source |       SS       df       MS              Number of obs =      36
-------------+------------------------------           F(  5,    31) =  116.92
       Model |  45797.0261     5  9159.40521           Prob > F      =  0.0000
    Residual |  2428.45141    31  78.3371423           R-squared     =  0.9496
-------------+------------------------------           Adj R-squared =  0.9415
       Total |  48225.4775    36   1339.5966           Root MSE      =  8.8508

------------------------------------------------------------------------------
         htr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           c |
          2  |   14.16667   3.613335     3.92   0.000     6.797221    21.53611
          3  |   17.08333   3.613335     4.73   0.000     9.713888    24.45278
          4  |         26   3.613335     7.20   0.000     18.63055    33.36945
          5  |   39.05555   3.613335    10.81   0.000      31.6861      46.425
          6  |   70.27778   3.613335    19.45   0.000     62.90834    77.64723
------------------------------------------------------------------------------

. test, showorder

 Order of columns in the design matrix
      1: (c==1)
      2: (c==2)
      3: (c==3)
      4: (c==4)
      5: (c==5)
      6: (c==6)


I know that the FAQ at <http://www.stata.com/support/faqs/stat/test1.html> uses the ibn.c notation, but unless I'm missing something, this should be fixed.
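
For anyone who wants to check this without my data, here is a minimal sketch of the same comparison. It assumes the auto dataset shipped with Stata, with price as the response and rep78 as the grouping variable; both are stand-ins I picked for illustration, not the variables in the log above. I have not included output, so whether the two model df differ on your installation is something to verify rather than take from me.

```stata
* Hypothetical reproduction on shipped data (not the htr data in this post)
sysuse auto, clear

* Cell means model with the base level explicitly suppressed
anova price ibn.rep78, noconstant
display "model df with ibn.rep78: " e(df_m)

* Same model, leaving -anova- to handle the categorical variable itself
anova price rep78, noconstant
display "model df with rep78:     " e(df_m)

* Replay the coefficients and list the design matrix columns
regress
test, showorder
```

If the two displayed df values differ by one, the second call is dropping a level in the same way as the c example above.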

