Re: st: Understanding Factor variables - is order significant ?


From   rgutierrez@stata.com (Roberto G. Gutierrez, StataCorp)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Understanding Factor variables - is order significant ?
Date   Wed, 26 May 2010 15:17:11 -0500

Jesper Lindhardsen <JESLIN01@geh.regionh.dk> writes:

> I am having a hard time understanding why 2 regression models that differ
> only by the "order" of the included factor variables yield different
> results???  I can't (or am too slow to) find the answer in the
> documentation, but I think it is related to the parsing of the baselevel
> specifiers (see model 1 legend = _b[0o.ra#0b.dm] ???).

> Here are the 2 commands and resulting output - as you can see I've only
> changed b1.ra#b0.dm to b0.dm#b1.ra. Output has been edited, but only left
> out if identical between models.

Jesper (and those others who have contributed on this thread) have discovered
a bug in how factor-variable interactions are being parsed in Stata.  The
specific conditions that trigger this are as follows:

   1. You specify a simple interaction (a single # sign) between two or more
      factor variables.

   2. The first variable in the interaction has the value zero as one of
      its categories.

   3. The first specification in the interaction has a base level that is 
      not the default of zero (the lowest level for the first variable).

   4. At least one of the remaining variables in the interaction has a base
      equal to the lowest-valued category for that variable, whether explicitly
      specified or taken as the default.

   5. You use one of the affected estimation commands.  Almost all
      estimation commands are affected by this bug, with -regress- being one
      notable exception.
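
As a rough way to check a specification against the list above, here is a
minimal sketch in Python (not Stata code, and not how Stata's parser works).
The function names and the (levels, base) representation are hypothetical,
introduced only for illustration.  It assumes that ra and dm are both 0/1
variables, and it treats -regress- as the only unaffected command, which
simplifies condition 5.

```python
def resolve_base(levels, base):
    """Return the base level in effect; None means the default (lowest level)."""
    return min(levels) if base is None else base


def matches_bug_conditions(terms, command):
    """Check an interaction against conditions 1-5 in the list above.

    terms   -- list of (levels, base) pairs, in the order the variables are
               written in the interaction; base is None when not specified
    command -- name of the estimation command (hypothetical representation)
    """
    # Condition 5, simplified: assume -regress- is the only exception.
    if command == "regress":
        return False
    # Condition 1: an interaction between two or more factor variables.
    if len(terms) < 2:
        return False
    first_levels, first_base = terms[0]
    # Condition 2: the first variable has zero as one of its categories.
    if 0 not in first_levels:
        return False
    # Condition 3: the first variable's base level is not zero.
    if resolve_base(first_levels, first_base) == 0:
        return False
    # Condition 4: some remaining variable has its lowest level as base,
    # whether specified explicitly or taken as the default.
    return any(resolve_base(lv, b) == min(lv) for lv, b in terms[1:])


# Jesper's two specifications, assuming ra and dm each take the values 0 and 1.
ra = ([0, 1], 1)  # b1.ra
dm = ([0, 1], 0)  # b0.dm

print(matches_bug_conditions([ra, dm], "logit"))  # b1.ra#b0.dm -> True
print(matches_bug_conditions([dm, ra], "logit"))  # b0.dm#b1.ra -> False
```

Under these assumptions, only the ordering with ra first meets all the
conditions, which is consistent with the two orderings giving different
results.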

When the above conditions occur, Stata attempts to omit an extra cell in
the interaction.  Sometimes the cell is omitted altogether; other times
Stata produces a coefficient for that cell, but with missing standard errors
and confidence intervals.  Either way, the model fit is thrown off because the
cell's coefficient is not properly estimated.

We will fix this in the next executable update, to be made available soon.

--Bobby						--Jeff
rgutierrez@stata.com				jpitblado@stata.com