Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: RE: Factor variable notation vs. hand made dummy vars

From: "Lachenbruch, Peter"
To: "statalist@hsphsun2.harvard.edu"
Subject: st: RE: Factor variable notation vs. hand made dummy vars
Date: Mon, 6 Feb 2012 08:09:29 -0800

```It looks like you have two cells (rep78 = 1 and 2) that predict failure perfectly:

. tab for rep78
           |                   Repair Record 1978
  Car type |         1          2          3          4          5 |     Total
-----------+-------------------------------------------------------+----------
  Domestic |         2          8         27          9          2 |        48
   Foreign |         0          0          3          9          9 |        21
-----------+-------------------------------------------------------+----------
     Total |         2          8         30         18         11 |        69
If I use logit for mpg i.rep78, nolog, I get:

. logit for mpg i.rep78,nolog
note: 1.rep78 != 0 predicts failure perfectly
      1.rep78 dropped and 2 obs not used
note: 2.rep78 != 0 predicts failure perfectly
      2.rep78 dropped and 8 obs not used
note: 5.rep78 omitted because of collinearity
Logistic regression                               Number of obs   =         59
                                                  LR chi2(3)      =      25.87
                                                  Prob > chi2     =     0.0000
Log likelihood = -25.478287                       Pseudo R2       =     0.3367
------------------------------------------------------------------------------
     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |      0.131      0.071     1.85   0.064       -0.008       0.270
             |
       rep78 |
          1  |      0.000  (empty)
          2  |      0.000  (empty)
          3  |     -3.136      1.045    -3.00   0.003       -5.184      -1.089
          4  |     -1.120      0.974    -1.15   0.250       -3.029       0.789
          5  |      0.000  (omitted)
             |
       _cons |     -1.723      1.776    -0.97   0.332       -5.205       1.759
------------------------------------------------------------------------------
The fifth category then becomes the reference category and is absorbed into the intercept.

. logit for mpg d1-d5,nolog
note: d1 != 0 predicts failure perfectly
      d1 dropped and 2 obs not used
note: d2 != 0 predicts failure perfectly
      d2 dropped and 8 obs not used
note: d5 omitted because of collinearity
Logistic regression                               Number of obs   =         59
                                                  LR chi2(3)      =      25.87
                                                  Prob > chi2     =     0.0000
Log likelihood = -25.478287                       Pseudo R2       =     0.3367
------------------------------------------------------------------------------
     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |      0.131      0.071     1.85   0.064       -0.008       0.270
          d1 |      0.000  (omitted)
          d2 |      0.000  (omitted)
          d3 |     -3.136      1.045    -3.00   0.003       -5.184      -1.089
          d4 |     -1.120      0.974    -1.15   0.250       -3.029       0.789
          d5 |      0.000  (omitted)
       _cons |     -1.723      1.776    -0.97   0.332       -5.205       1.759
------------------------------------------------------------------------------

________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ulrich Kohler [kohler@wzb.eu]
Sent: Monday, February 06, 2012 7:25 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: Factor variable notation vs. hand made dummy vars

Hi all,

I cannot replicate the model

. sysuse auto, clear
. tab rep78, gen(d)
. logit for mpg d2-d5

with factor variable notation. I tried

. logit for mpg ib1.rep78

but results differ. Can anybody explain why?

(Note as an aside that

. logit for mpg d1-d5

reproduces the factor variables solution, but normally I would not
specify the model this way)

Update status
Last check for updates:  06 Feb 2012
New update available:    none         (as of 06 Feb 2012)
Current update level:    30 Jan 2012  (what's new)

Uli

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*
```
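
For anyone who wants to rerun the comparison discussed in this thread, the following is a minimal sketch of a do-file that fits both specifications on the auto data and lists them side by side. The stored-estimate names `dummies` and `factor` are labels chosen here for illustration and do not appear in the original posts; `foreign` is written out in full rather than abbreviated to `for`. No output is shown because it may differ across Stata versions and update levels.

```stata
* Load the example data and inspect the cells behind the
* "predicts failure perfectly" notes
sysuse auto, clear
tabulate foreign rep78

* Hand-made dummies d1-d5, one per level of rep78
tabulate rep78, generate(d)

* Specification from the original question: d1 left out by hand
logit foreign mpg d2-d5, nolog
estimates store dummies      // hypothetical name for this sketch

* Factor-variable version with level 1 requested as the base
logit foreign mpg ib1.rep78, nolog
estimates store factor       // hypothetical name for this sketch

* Compare coefficients, estimation-sample sizes, and log likelihoods
estimates table dummies factor, stats(N ll)
```

Comparing `N` across the two columns is a quick way to see whether the two fits dropped the same observations; if they did not, the coefficients are not expected to match.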