
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: RE: Factor variable notation vs. hand made dummy vars


From   "Lachenbruch, Peter" <[email protected]>
To   "[email protected]" <[email protected]>
Subject   st: RE: Factor variable notation vs. hand made dummy vars
Date   Mon, 6 Feb 2012 08:09:29 -0800

It looks like you have two cells (rep78 = 1 and 2) that predict failure perfectly:

. tab for rep78
           |                   Repair Record 1978
  Car type |         1          2          3          4          5 |     Total
-----------+-------------------------------------------------------+----------
  Domestic |         2          8         27          9          2 |        48 
   Foreign |         0          0          3          9          9 |        21 
-----------+-------------------------------------------------------+----------
     Total |         2          8         30         18         11 |        69 

If I use logit for mpg i.rep78, nolog, I get

. logit for mpg i.rep78,nolog
note: 1.rep78 != 0 predicts failure perfectly
      1.rep78 dropped and 2 obs not used
note: 2.rep78 != 0 predicts failure perfectly
      2.rep78 dropped and 8 obs not used
note: 5.rep78 omitted because of collinearity
Logistic regression                               Number of obs   =         59
                                                  LR chi2(3)      =      25.87
                                                  Prob > chi2     =     0.0000
Log likelihood = -25.478287                       Pseudo R2       =     0.3367
------------------------------------------------------------------------------
     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |      0.131      0.071     1.85   0.064       -0.008       0.270
             |
       rep78 |
          1  |      0.000  (empty)
          2  |      0.000  (empty)
          3  |     -3.136      1.045    -3.00   0.003       -5.184      -1.089
          4  |     -1.120      0.974    -1.15   0.250       -3.029       0.789
          5  |      0.000  (omitted)
             |
       _cons |     -1.723      1.776    -0.97   0.332       -5.205       1.759
------------------------------------------------------------------------------
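With categories 1 and 2 dropped, the fifth category becomes the reference level, so its effect is absorbed into the intercept.

Using the hand-made dummies d1-d5 gives the same fit: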


. logit for mpg d1-d5,nolog
note: d1 != 0 predicts failure perfectly
      d1 dropped and 2 obs not used
note: d2 != 0 predicts failure perfectly
      d2 dropped and 8 obs not used
note: d5 omitted because of collinearity
Logistic regression                               Number of obs   =         59
                                                  LR chi2(3)      =      25.87
                                                  Prob > chi2     =     0.0000
Log likelihood = -25.478287                       Pseudo R2       =     0.3367
------------------------------------------------------------------------------
     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |      0.131      0.071     1.85   0.064       -0.008       0.270
          d1 |      0.000  (omitted)
          d2 |      0.000  (omitted)
          d3 |     -3.136      1.045    -3.00   0.003       -5.184      -1.089
          d4 |     -1.120      0.974    -1.15   0.250       -3.029       0.789
          d5 |      0.000  (omitted)
       _cons |     -1.723      1.776    -0.97   0.332       -5.205       1.759
------------------------------------------------------------------------------
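
Both fits above end up with 59 observations, which matches the 30 + 18 + 11 cars in repair categories 3 to 5 in the table. As a rough sketch, assuming the same auto dataset, the same model can be requested explicitly by restricting the sample to those categories and naming category 5 as the base level. The ib5. prefix and the inlist() restriction are illustrative choices, not something taken from the original post, and no output is shown here; the estimates should agree with the tables above if the automatic dropping works as the notes describe.

```stata
* Sketch only: the sample restriction and base level are assumptions for illustration
sysuse auto, clear

* Cross-tabulate to see which rep78 cells contain no foreign cars
tabulate foreign rep78

* Keep repair categories 3-5 and make category 5 the base level explicitly
* (foreign is the full name of the variable abbreviated as "for" above)
logit foreign mpg ib5.rep78 if inlist(rep78, 3, 4, 5), nolog
```

Spelling out the sample and the base level this way makes it visible which observations and which reference category the reported coefficients refer to, rather than leaving both to the perfect-prediction and collinearity notes.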


________________________________________
From: [email protected] [[email protected]] On Behalf Of Ulrich Kohler [[email protected]]
Sent: Monday, February 06, 2012 7:25 AM
To: [email protected]
Subject: st: Factor variable notation vs. hand made dummy vars

Hi all,

I cannot replicate the model

. sysuse auto, clear
. tab rep78, gen(d)
. logit for mpg d2-d5

with factor variable notation. I tried

. logit for mpg ib1.rep78

but results differ. Can anybody explain why?

(Note as an aside that

. logit for mpg d1-d5

reproduces the factor variables solution, but normally I would not
specify the model this way)


Update status
    Last check for updates:  06 Feb 2012
    New update available:    none         (as of 06 Feb 2012)
    Current update level:    30 Jan 2012  (what's new)


Uli







*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*

