st: Why do I get two different results from the same specification and the same dataset?


From   Yuval Arbel <yuval.arbel@gmail.com>
To   statalist <statalist@hsphsun2.harvard.edu>
Subject   st: Why do I get two different results from the same specification and the same dataset?
Date   Sun, 6 Nov 2011 14:24:19 +0200

Dear statalist participants,

When I run the following regression:

reg bid_win dev_cost bid_num year area units min min_price ///
    c.dev_cost#i.min c.bid_num#i.min c.year#i.min c.area#i.min ///
    c.units#i.min

I get the following output:


      Source |       SS       df       MS              Number of obs =    6802
-------------+------------------------------           F( 12,  6789) = 2891.19
       Model |  7.0107e+17    12  5.8423e+16           Prob > F      =  0.0000
    Residual |  1.3719e+17  6789  2.0207e+13           R-squared     =  0.8363
-------------+------------------------------           Adj R-squared =  0.8361
       Total |  8.3826e+17  6801  1.2326e+14           Root MSE      =  4.5e+06

------------------------------------------------------------------------------
     bid_win |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    dev_cost |  -.0782451   .1319286    -0.59   0.553    -.3368666    .1803764
     bid_num |   53637.98   18087.12     2.97   0.003     18181.54    89094.41
        year |   61991.65   44544.85     1.39   0.164    -25330.21    149313.5
        area |   204.3705   105.0742     1.95   0.052    -1.607834    410.3488
       units |   52691.04   11756.39     4.48   0.000     29644.82    75737.26
         min |   1.04e+08   9.74e+07     1.06   0.288    -8.73e+07    2.95e+08
   min_price |   3.956053   .0241168   164.04   0.000     3.908777     4.00333
             |
         min#|
  c.dev_cost |
          1  |   -.460682   .1391518    -3.31   0.001    -.7334631    -.187901
             |
         min#|
   c.bid_num |
          1  |  -43194.68   19552.09    -2.21   0.027     -81522.9   -4866.457
             |
  min#c.year |
          1  |  -51639.35   48543.57    -1.06   0.287      -146800    43521.27
             |
  min#c.area |
          1  |   186.6343   110.1857     1.69   0.090    -29.36416    402.6327
             |
 min#c.units |
          1  |  -128569.4   12642.23   -10.17   0.000    -153352.2   -103786.7
             |
       _cons |  -1.25e+08   8.94e+07    -1.39   0.163    -3.00e+08    5.05e+07
------------------------------------------------------------------------------


But when I define the interaction variables directly and run the
regression, I get different results:

reg bid_win dev_cost bid_num year area units min min_price ///
    dev_cost_int bid_num_int year_int area_int units_int

      Source |       SS       df       MS              Number of obs =    6802
-------------+------------------------------           F( 12,  6789) = 2840.90
       Model |  6.9905e+17    12  5.8254e+16           Prob > F      =  0.0000
    Residual |  1.3921e+17  6789  2.0505e+13           R-squared     =  0.8339
-------------+------------------------------           Adj R-squared =  0.8336
       Total |  8.3826e+17  6801  1.2326e+14           Root MSE      =  4.5e+06

------------------------------------------------------------------------------
     bid_win |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    dev_cost |   .3458744   .1259233     2.75   0.006     .0990254    .5927235
     bid_num |   50612.77   18218.04     2.78   0.005     14899.71    86325.84
        year |    26138.3   44731.32     0.58   0.559     -61549.1    113825.7
        area |    796.392   88.98841     8.95   0.000     621.9468    970.8371
       units |  -56322.78   4522.886   -12.45   0.000    -65189.06   -47456.51
         min |   2.11e+07   9.78e+07     0.22   0.829    -1.71e+08    2.13e+08
   min_price |   3.914549   .0241269   162.25   0.000     3.867252    3.961845
dev_cost_int |  -.9575921   .1316807    -7.27   0.000    -1.215728   -.6994567
 bid_num_int |  -40191.13   19694.51    -2.04   0.041    -78798.54   -1583.728
    year_int |  -10450.87   48755.43    -0.21   0.830    -106026.8    85125.05
    area_int |  -443.5883   91.15801    -4.87   0.000    -622.2866     -264.89
   units_int |  -.2338972    .131622    -1.78   0.076    -.4919176    .0241233
       _cons |  -5.29e+07   8.98e+07    -0.59   0.556    -2.29e+08    1.23e+08
------------------------------------------------------------------------------

My question is: why do I get two different results from the same specification?
For example, the coefficient of "dev_cost" changes sign and becomes
significant in the second regression.
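
One thing that can be checked is whether each hand-made variable really equals the product that the factor-variable notation builds. The sketch below assumes that min is a 0/1 indicator and that each *_int variable was meant to be the corresponding variable multiplied by min. Both are assumptions, since the generate commands are not shown above. A possible hint in the output: the coefficient and standard error of units_int (-.2339, .1316) are on the scale of the dev_cost terms rather than that of min#c.units (-128569.4, 12642.23).

```stata
* Sketch only: assumes min is 0/1 and each *_int was meant to be var*min
foreach v of varlist dev_cost bid_num year area units {
    tempvar chk
    * rebuild the interaction in double precision
    generate double `chk' = `v' * min
    * count observations where the stored interaction departs from the
    * product; the tolerance allows for float storage of the *_int variable
    quietly count if reldif(`chk', `v'_int) > 1e-6
    display "`v'_int: " r(N) " observations differ from `v'*min"
    drop `chk'
}
```

If every count is zero, the two specifications use the same regressors and the difference must come from somewhere else. A nonzero count points to the variable whose definition should be rechecked.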

-- 
Dr. Yuval Arbel
School of Business
Carmel Academic Center
4 Shaar Palmer Street, Haifa, Israel
e-mail: yuval.arbel@gmail.com