The Stata listserver

Re: st: [Why does Stata drop an additional category?]

From   Richard Williams
Subject   Re: st: [Why does Stata drop an additional category?]
Date   Tue, 17 Feb 2004 09:26:43 -0500

At 08:26 AM 2/17/2004 -0500, Marcello Pagano wrote:
The first four of the following commands all end in _Ieduc5_1 dropped
due to collinearity; the last gives the coefficient for this category
but the model statistics seems strange: no constant and no comparability
with the output in SPSS for the same data (the syntax and the results
are attached).

xi: logistic unemployed  i.religion i.educ5 i.cath*e5c
xi: logistic unemployed  i.religion i.educ5 i.cath*e5c,coef
xi: logit    unemployed  i.religion i.educ5 i.cath*e5c, or
xi: logit    unemployed  i.religion i.educ5 i.cath*e5c
xi: logit    unemployed  i.religion i.educ5 i.cath*e5c,nocons or

I would be most grateful if anyone could explain to me why in the first
four commands the first category of education (_Ieduc5_1) is dropped (I
used both Stata 7 and Stata 8); and how to obtain an output similar to
that in SPSS. The results for models 2-4 are not presented because they
are very similar to results from model 1.

With regard to educ5, something has to be dropped: a 5-category variable yields 5 dummies that sum to 1 for every observation, so they are perfectly collinear with the constant and only 4 of them can be included alongside it. When you specify -nocons- (noconstant) you make it possible for all 5 categories of educ5 to be entered, but of course you lose the constant then.
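
The collinearity can be checked directly. The lines below are only a sketch: they assume educ5 takes exactly five distinct values, and the stub name ed (giving ed1 through ed5) is made up for the illustration.

* Create one 0/1 dummy per category of educ5: ed1, ed2, ..., ed5
tabulate educ5, generate(ed)
* The five dummies sum to 1 wherever educ5 is observed, which is
* exactly the column of ones that the constant contributes
assert ed1 + ed2 + ed3 + ed4 + ed5 == 1 if !missing(educ5)

If the assert passes silently, any model containing a constant and all five dummies is overparameterized, which is why one term has to go.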

With regard to SPSS versus Stata: the -logistic- command reports odds ratios by default. If you add the option -coef- to the end of your -logistic- commands, you'll get the coefficient estimates instead. Or use -logit- instead of -logistic-, as -logit- reports coefficient estimates by default. It looks like you are doing that, so...
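
As a rough illustration of how the two displays relate, here is a sketch on a deliberately simplified model (educ5 only, not the full model from the question). It assumes educ5 is coded 1 to 5, so that -xi- creates a dummy named _Ieduc5_2.

xi: logit unemployed i.educ5
* Redisplay the same fit as odds ratios; nothing is re-estimated
logit, or
* Each odds ratio is just exp() of the corresponding coefficient
display exp(_b[_Ieduc5_2])

The number displayed by the last line should match the odds ratio shown for _Ieduc5_2 in the -logit, or- table.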

Other seeming differences may reflect differences in how the dummy variable coding is handled, i.e. Stata and SPSS may be using different reference categories. -xi- uses the smallest/first value as the reference category. SPSS Logistic, I believe, uses the highest/last category as the reference. So, you'll get algebraically equivalent results, but the coefficients will look different because different reference categories are used.

If you look at the SPSS documentation for Logistic, you'll see how you can change the reference category. Or, in Stata, you can change the reference category with a command like

char varname[omit] #

e.g. char educ5[omit] 5
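
To see what changing the reference category does, one could compare two fits. This is a sketch on a simplified model with educ5 as the only predictor, assuming educ5 is coded 1 to 5; the dummy names follow the usual -xi- pattern.

* Default: xi omits the lowest category, so educ5==1 is the reference
xi: logit unemployed i.educ5
* Contrast of category 5 against category 1
display _b[_Ieduc5_5]

* Make category 5 the reference and refit
char educ5[omit] 5
xi: logit unemployed i.educ5
* Contrast of category 1 against category 5: same size, opposite sign
display _b[_Ieduc5_1]

* Clear the characteristic to go back to the default reference
char educ5[omit]

The log likelihood is the same in both fits; only the way the education effects are parameterized differs.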

In short, I am guessing your results are equivalent, but SPSS and Stata are reporting the results in different but equivalent ways.
