
st: Why does Stata drop an additional category?


From   "Sarah A. Mustillo" <smustillo@psych.duhs.duke.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: Why does Stata drop an additional category?
Date   Tue, 17 Feb 2004 09:46:20 -0500

I think I am going to have to disagree with Rich on this one and say that perhaps the additional category is being dropped because of the interaction term.

Yaojun is specifying that the highest-numbered education category (5) be used as the reference category instead of the default, yet the lowest-numbered category (1) is being dropped as well. Looking at his command, he is including education as a main effect (i.educ5) and then what appears to be a centered education variable (e5c) in the interaction. Using xi to create the interaction automatically includes the main effects as well, though. So I think he is entering education as a main effect twice, once as i.educ5 and once as e5c, which is probably why the extra category is being dropped due to collinearity.
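
A rough way to check this, offered only as a sketch: going by the codings listed in the original message, e5c looks like educ5 reversed and centered, i.e. e5c = 3 - educ5. That relationship is my assumption, not something the post states outright. If it holds, e5c is an exact linear combination of the educ5 dummies and the constant, so one term has to go. The variable name cathXe5c below is hypothetical, and the respecified model is just one possible way to enter the interaction without the duplicate main effect.

```stata
* Assumption: e5c was built as 3 - educ5 (inferred from the listed codings).
* If this assert passes, e5c adds no information beyond i.educ5 plus the
* constant, which would explain the collinearity note.
assert e5c == 3 - educ5 if !missing(educ5, e5c)

* One possible respecification: build the interaction by hand, so that xi
* does not also add the e5c main effect and a second Catholic dummy.
* cathXe5c is a hypothetical name.
gen cathXe5c = cath*e5c
xi: logistic unemployed i.religion i.educ5 cathXe5c
```

If the assumption is right, this should fit the same model as the first command (same log likelihood), but with all four non-reference education dummies reported and nothing dropped.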

Sarah

xi: logistic unemployed i.religion i.educ5 i.cath*e5c






I would like to ask you for advice. I am doing a logit on unemployment
and my independent variables are:

religion    1=Protestant; 2=Catholic; 3=None
cath        1=Catholic; 0=Other
educ5       1=Degree; 2=Prof below degree; 3=A Levels; 4=O Levels; 5=None
e5c         -2=None; -1=O Level; 0=A Levels; 1=Prof below degree; 2=Degree

gen	cath=religion==2 if religion<=3
char 	religion[omit]	1 	/*	Protestants	=base*/
char 	educ5[omit]		5 	/*	None/Prim	=base*/

I wish to use the main effects of religion and educ5, and the
interaction effects for cath*e5c.

The first four of the following commands all end with _Ieduc5_1 dropped
due to collinearity; the last gives the coefficient for this category,
but the model statistics seem strange: no constant and no comparability
with the output in SPSS for the same data (the syntax and the results
are attached).

xi: logistic unemployed  i.religion i.educ5 i.cath*e5c 	
xi: logistic unemployed  i.religion i.educ5 i.cath*e5c,coef	
xi: logit    unemployed  i.religion i.educ5 i.cath*e5c, or	
xi: logit    unemployed  i.religion i.educ5 i.cath*e5c	
xi: logit    unemployed  i.religion i.educ5 i.cath*e5c,nocons or

I would be most grateful if anyone could explain to me why in the first
four commands the first category of education (_Ieduc5_1) is dropped (I
used both Stata 7 and Stata 8); and how to obtain an output similar to
that in SPSS. The results for models 2-4 are not presented because they
are very similar to results from model 1.

Many thanks.
Yours sincerely
Yaojun Li

****************************************************************************
*Stata outputs for model 1:
. xi: logistic unemployed  i.religion i.educ5 i.cath*e5c
/*_Ieduc5_1 dropped*/
i.religion        _Ireligion_1-3      (naturally coded; _Ireligion_1 omitted)
i.educ5           _Ieduc5_1-5         (naturally coded; _Ieduc5_5 omitted)
i.cath            _Icath_0-1          (naturally coded; _Icath_0 omitted)
i.cath*e5c        _IcatXe5c_#         (coded as above)

note: _Ieduc5_1 dropped due to collinearity
note: _Icath_1 dropped due to collinearity

Logistic regression                               Number of obs   =       1936
                                                  LR chi2(7)      =     186.93
                                                  Prob > chi2     =     0.0000
Log likelihood = -934.26692                       Pseudo R2       =     0.0909

------------------------------------------------------------------------------
  unemployed | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Ireligion_2 |   3.150993   .5743082     6.30   0.000     2.204475    4.503909
_Ireligion_3 |   2.095747   .6119331     2.53   0.011     1.182492    3.714322
   _Ieduc5_2 |   2.469711   1.182263     1.89   0.059     .9664331    6.311325
   _Ieduc5_3 |   2.554673    .851014     2.82   0.005     1.329789    4.907813
   _Ieduc5_4 |   .8599691   .2137221    -0.61   0.544     .5283719    1.399671
         e5c |   .4881296   .0769878    -4.55   0.000     .3583303    .6649465
 _IcatXe5c_1 |   1.079315   .1140581     0.72   0.470     .8773971      1.3277
------------------------------------------------------------------------------

(results from models 2-4 omitted)

*Stata output for model 5:
. xi: logit    unemployed  i.religion i.educ5 i.cath*e5c,nocons or
i.religion        _Ireligion_1-3      (naturally coded; _Ireligion_1 omitted)
i.educ5           _Ieduc5_1-5         (naturally coded; _Ieduc5_5 omitted)
i.cath            _Icath_0-1          (naturally coded; _Icath_0 omitted)
i.cath*e5c        _IcatXe5c_#         (coded as above)

note: _Icath_1 dropped due to collinearity
Iteration 0:   log likelihood = -1341.9329
Iteration 1:   log likelihood = -956.23273
Iteration 2:   log likelihood = -936.45777
Iteration 3:   log likelihood = -934.40179
Iteration 4:   log likelihood = -934.26857
Iteration 5:   log likelihood = -934.26692

Logit estimates                                   Number of obs   =       1936
                                                  LR chi2(8)      =          .
Log likelihood = -934.26692                       Prob > chi2     =          .

------------------------------------------------------------------------------
  unemployed | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Ireligion_2 |   3.150993   .5742976     6.30   0.000     2.204489    4.503879
_Ireligion_3 |   2.095747   .6119294     2.53   0.011     1.182496    3.714309
   _Ieduc5_1 |   .0037776   .0023571    -8.94   0.000      .001112    .0128333
   _Ieduc5_2 |   .0376322   .0088729   -13.91   0.000     .0237062    .0597387
   _Ieduc5_3 |    .157016   .0276982   -10.50   0.000     .1111192    .2218702
   _Ieduc5_4 |      .2132   .0434898    -7.58   0.000       .14294    .3179952
         e5c |   1.968932   .1016632    13.12   0.000     1.779427     2.17862
 _IcatXe5c_1 |   1.079315   .1140558     0.72   0.470     .8774008    1.327695
------------------------------------------------------------------------------




*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



Sarah A. Mustillo, Ph.D
Department of Psychiatry and Behavioral Sciences
Duke University School of Medicine
Box 3454
Durham NC 27710

919 687-4686 x231