Re: st: Understanding Factor variables - is order significant ?

From   "Michael N. Mitchell" <>
Subject   Re: st: Understanding Factor variables - is order significant ?
Date   Tue, 25 May 2010 18:32:00 -0700

Dear Jesper

This is not really a "Stata" issue, but an issue regarding the coding of dummy variables. Let's take an even simpler case. Suppose you have a variable named "female" that is coded 1 if someone is female and 0 if they are male. But then you change your mind and instead include a variable named "male" that is coded 1 if someone is male and 0 otherwise. The coefficient will change (in the case of a linear model, it will have the same magnitude but the opposite sign), but note that the p value will remain the same, as it is still a test of the difference between males and females.
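As a minimal sketch of this point, here is the same comparison on Stata's shipped auto dataset rather than your data. The variable -foreign- stands in for "female", and -domestic- is a flipped copy created purely for illustration; neither is from your model.

```stata
* Illustration only: auto.dta stands in for the real data
sysuse auto, clear

* foreign is coded 1 = foreign, 0 = domestic; build the flipped dummy
generate domestic = 1 - foreign

* Coding 1: the coefficient is mean(foreign) - mean(domestic)
regress price foreign
scalar b_foreign = _b[foreign]
scalar t_foreign = _b[foreign] / _se[foreign]

* Coding 2: the coefficient is mean(domestic) - mean(foreign)
regress price domestic
scalar b_domestic = _b[domestic]
scalar t_domestic = _b[domestic] / _se[domestic]

* Same magnitude with opposite sign, so the two-sided p value is unchanged
display scalar(b_foreign), scalar(b_domestic)
display scalar(t_foreign), scalar(t_domestic)
```

The constant also changes between the two runs, because it is the mean of whichever group is coded 0.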

Extend that idea to your interaction. Suppose you flip the coding of your "ra" and "dm" variables. The test of the interaction (its p value) will remain the same, assuming both are dummy variables. The coefficients of "ra" and "dm" will change, due to the change in coding. The details get more complicated, but are explained in section 3.5 of . It is explained using the old "xi" terminology, but the issues are still the same.
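A rough sketch of the interaction case, again on auto.dta as a stand-in for your data. Here -highmpg- is a hypothetical dummy created for illustration, and the ib1. prefix makes level 1 the base level, which has the same effect as flipping the 0/1 coding.

```stata
* Illustration only: two dummies from auto.dta stand in for "ra" and "dm"
sysuse auto, clear
generate highmpg = mpg > 25

* Default base levels (0 for both dummies)
regress price i.foreign##i.highmpg
test 1.foreign#1.highmpg
scalar p_default = r(p)

* Flipped base levels (1 for both dummies)
regress price ib1.foreign##ib1.highmpg
test 0.foreign#0.highmpg
scalar p_flipped = r(p)

* The interaction test is identical under both codings
display scalar(p_default), scalar(p_flipped)
```

The main-effect rows differ between the two outputs because each one is now the effect of one variable at the new base level of the other.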

  I hope that helps,

Best luck,

Michael N. Mitchell
Data Management Using Stata
A Visual Guide to Stata Graphics
Stata tidbit of the week

On 2010-05-25 3.22 PM, Jesper Lindhardsen wrote:
Dear Statalisters,

I am having a hard time understanding why 2 regression models that
differ only by the "order" of the included factor variables yield
different results???
I can't (or am too slow to) find the answer in the documentation, but I
think it is related to the parsing of the baselevel specifiers (see
model 1 legend = _b[] ???).

Here are the 2 commands and resulting output - as you can see I've only
changed to Output has been edited, but only
left out if identical between models.

(System: Stata 11/MP for windows, born 10 feb 2010)

poisson _d i.alder_k sex if ex==0, e(risk_tid) irr
_d          IRR       Legend
0 0     1.487748  _b[]
0 1     1.968017  _b[]
1 1     2.787839  _b[]

1     6.176815  _b[1.alder_k]
2     18.09798  _b[2.alder_k]

sex    2.070646  _b[sex]
risk_tid  (exposure)

poisson _d i.alder_k sex  if ex==0, e(risk_tid) irr

_d         IRR         Legend
0 0     .5935912  _b[]
1 0     1.169963  _b[]
1 1      1.65762  _b[]

1     6.171095  _b[1.alder_k]
2     18.07456  _b[2.alder_k]

sex    2.072329  _b[sex]
risk_tid  (exposure)
Hope it's not too elementary.
Thank you all for your contributions to Statalist; it's a really
valuable source of information for me.

Jesper Lindhardsen
MD, Ph.d. student
Department of Cardiovascular Research
Copenhagen University Hospital, Gentofte
