
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Why do I get two different results from the same specification and the same dataset?


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: Why do I get two different results from the same specification and the same dataset?
Date   Sun, 6 Nov 2011 14:28:05 +0000

The same point can be made differently. Yuval's story depends on (a) data that we cannot see and (b) code that is, in part, not shown to us. There can be no objection to people working with their own data, naturally, but it doesn't make remote analysis any easier. However, any bug in Stata, if it exists, can be demonstrated with data accessible to all. More crucially, we have no way of checking how Yuval created the interactions himself.

So, the best hypothesis on this evidence is that Yuval did something differently in the code not shown to us. 
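
A minimal sketch of such a check on data accessible to all, assuming the auto dataset shipped with Stata stands in for Yuval's data. The variables weight_int, length_int and check are hypothetical names introduced only for this illustration. If the interactions are built by hand as the product of each covariate with the indicator, the two regressions should report the same fit and the same coefficients; a mismatch points to how the hand-built variables were created.

```stata
* Public dataset, so anyone can rerun this
sysuse auto, clear

* Specification 1: factor-variable notation
regress price weight length foreign c.weight#i.foreign c.length#i.foreign
estimates store fv

* Specification 2: interactions built by hand (hypothetical names)
generate double weight_int = weight * foreign
generate double length_int = length * foreign
regress price weight length foreign weight_int length_int
estimates store manual

* Compare the two sets of estimates
estimates table fv manual, b(%12.0g) se

* Check a hand-built interaction against a fresh product;
* assert stops with an error if any observation differs
generate double check = weight * foreign
assert reldif(weight_int, check) < 1e-6
```

The same assert, applied to each of one's own hand-built variables (for example, an interaction variable against the product of its covariate and the indicator), shows directly whether any of them was created from the wrong covariate.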

Nick 
[email protected] 

Joerg Luedicke

You must have made a mistake when creating your interaction terms
"directly". I cannot think of any other explanation.

On Sun, Nov 6, 2011 at 7:24 AM, Yuval Arbel <[email protected]> wrote:

> when I run the following regression
>
> reg bid_win dev_cost bid_num year area units min min_price
> c.dev_cost#i.min c.bid_num#i.min c.year#i.min c.area#i.min
> c.units#i.min
>
> I get the following output:
>
>
>      Source |       SS       df       MS              Number of obs =    6802
> -------------+------------------------------           F( 12,  6789) = 2891.19
>       Model |  7.0107e+17    12  5.8423e+16           Prob > F      =  0.0000
>    Residual |  1.3719e+17  6789  2.0207e+13           R-squared     =  0.8363
> -------------+------------------------------           Adj R-squared =  0.8361
>       Total |  8.3826e+17  6801  1.2326e+14           Root MSE      =  4.5e+06
>
> ------------------------------------------------------------------------------
>     bid_win |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>    dev_cost |  -.0782451   .1319286    -0.59   0.553    -.3368666    .1803764
>     bid_num |   53637.98   18087.12     2.97   0.003     18181.54    89094.41
>        year |   61991.65   44544.85     1.39   0.164    -25330.21    149313.5
>        area |   204.3705   105.0742     1.95   0.052    -1.607834    410.3488
>       units |   52691.04   11756.39     4.48   0.000     29644.82    75737.26
>         min |   1.04e+08   9.74e+07     1.06   0.288    -8.73e+07    2.95e+08
>   min_price |   3.956053   .0241168   164.04   0.000     3.908777     4.00333
>             |
>         min#|
>  c.dev_cost |
>          1  |   -.460682   .1391518    -3.31   0.001    -.7334631    -.187901
>             |
>         min#|
>   c.bid_num |
>          1  |  -43194.68   19552.09    -2.21   0.027     -81522.9   -4866.457
>             |
>  min#c.year |
>          1  |  -51639.35   48543.57    -1.06   0.287      -146800    43521.27
>             |
>  min#c.area |
>          1  |   186.6343   110.1857     1.69   0.090    -29.36416    402.6327
>             |
>  min#c.units |
>          1  |  -128569.4   12642.23   -10.17   0.000    -153352.2   -103786.7
>             |
>       _cons |  -1.25e+08   8.94e+07    -1.39   0.163    -3.00e+08    5.05e+07
> ------------------------------------------------------------------------------
>
>
> But when I define the interaction variables directly and run the
> regression, I get different results:
>
> . reg bid_win dev_cost bid_num year area units min min_price
> dev_cost_int bid_num_int year_int area_int units_int
>
>      Source |       SS       df       MS              Number of obs =    6802
> -------------+------------------------------           F( 12,  6789) = 2840.90
>       Model |  6.9905e+17    12  5.8254e+16           Prob > F      =  0.0000
>    Residual |  1.3921e+17  6789  2.0505e+13           R-squared     =  0.8339
> -------------+------------------------------           Adj R-squared =  0.8336
>       Total |  8.3826e+17  6801  1.2326e+14           Root MSE      =  4.5e+06
>
> ------------------------------------------------------------------------------
>     bid_win |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>    dev_cost |   .3458744   .1259233     2.75   0.006     .0990254    .5927235
>     bid_num |   50612.77   18218.04     2.78   0.005     14899.71    86325.84
>        year |    26138.3   44731.32     0.58   0.559     -61549.1    113825.7
>        area |    796.392   88.98841     8.95   0.000     621.9468    970.8371
>       units |  -56322.78   4522.886   -12.45   0.000    -65189.06   -47456.51
>         min |   2.11e+07   9.78e+07     0.22   0.829    -1.71e+08    2.13e+08
>   min_price |   3.914549   .0241269   162.25   0.000     3.867252    3.961845
> dev_cost_int |  -.9575921   .1316807    -7.27   0.000    -1.215728   -.6994567
>  bid_num_int |  -40191.13   19694.51    -2.04   0.041    -78798.54   -1583.728
>    year_int |  -10450.87   48755.43    -0.21   0.830    -106026.8    85125.05
>    area_int |  -443.5883   91.15801    -4.87   0.000    -622.2866     -264.89
>   units_int |  -.2338972    .131622    -1.78   0.076    -.4919176    .0241233
>       _cons |  -5.29e+07   8.98e+07    -0.59   0.556    -2.29e+08    1.23e+08
> ------------------------------------------------------------------------------
>
> My question is: why do I get two different results from the same specification?
> To give one example: note that the coefficient of "dev_cost" changes
> sign and becomes significant.
>
