

From |
David Hoaglin <dchoaglin@gmail.com> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: indicator variable and interaction term different signs but both significant |

Date |
Mon, 8 Apr 2013 14:29:16 -0400 |

Thanks, Richard.

I don't have examples of "terrible harms," but the common phrasing can easily mislead, by giving the impression that one can hold the other predictors constant when the data do not support such a statement and (perhaps) by not keeping in view the other predictors, which have been adjusted for (not, in general, "controlled for").

Rather than continue to torment the poor indicator variable in the initial example, let's look at two aspects of a generic multiple regression: the regression equation and the data.

The regression equation, containing all the predictors, does not specify an order in which the predictors entered the model. At that stage they are all included. We could make a partial regression plot for each of them, so the interpretation of the coefficient of each predictor in the model should reflect the fact of the adjustment for the contributions of the other predictors. In other words, all the predictors are in the equation together, and the role of each takes into account the contributions of all the others. Holding the values of the other predictors constant is simply not part of the description of the role of an individual predictor.

Once the fitted regression equation is in hand, the analyst must decide how to use it. If the data support predictions that change one variable and hold other variables constant at specified values, well and good. It should usually be possible to make such predictions to at least a limited extent, because we have tacitly assumed that we have a good model. Some sets of data are designed to support such predictions over a sizable region of "predictor space." My point is that the data determine the extent to which an analyst can make such predictions. Thus, the analyst has the obligation to explain which predictions are well supported by the data and which are extrapolations (and, if so, by how much).
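The "adjusted for" reading of a coefficient can be checked directly. What follows is a minimal sketch, not part of the original message: it assumes the auto dataset that ships with Stata, and the variable names e_price and e_mpg are made up for illustration. It draws the partial regression (added-variable) plot for one predictor and then reproduces that predictor's coefficient by regressing residuals on residuals.

```stata
* Sketch only: uses the auto dataset that ships with Stata,
* not the data discussed in this thread.
sysuse auto, clear

* Full model: the coefficient of mpg is adjusted for weight and foreign.
regress price mpg weight foreign
scalar b_full = _b[mpg]

* Partial regression (added-variable) plot for mpg.
avplot mpg

* The same slope by hand: remove the contributions of weight and foreign
* from both price and mpg, then regress residual on residual.
quietly regress price weight foreign
predict double e_price, residuals
quietly regress mpg weight foreign
predict double e_mpg, residuals
regress e_price e_mpg

display "full-model coefficient of mpg: " b_full
display "residual-on-residual slope:    " _b[e_mpg]
```

The two displayed slopes should agree up to rounding. The standard errors in the last regression will differ somewhat from those in the full model, because the residual-on-residual fit does not account for the degrees of freedom used in the two adjustment regressions.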
The correct general interpretation helps to keep that obligation in view, and it avoids the impression that one can assign arbitrary constant values to the other predictors. In some situations it is not hard to imagine that an incautious policymaker would manipulate a single policy-relevant variable and be surprised at the unintended consequences that emerge when other variables change along with it.

I have not had time to look at the details of the example that you mentioned. I would have a problem with unqualified extrapolations. It is likely not to be a good idea to make comparisons involving BMI between men and women outside the interval of BMI where one has data from both men and women. If the predictions were accompanied by appropriate confidence intervals, the widths of the CIs might give some warning of the extrapolation, but I would prefer careful examination of the extent of the data.

David Hoaglin

On Mon, Apr 8, 2013 at 11:28 AM, Richard Williams <richardwilliams.ndu@gmail.com> wrote:
> Thanks for your detailed response David. I appreciate it and I will go over
> it carefully.
>
> I still wonder when and how the "common phrasing" is often incorrect, and,
> more critically, what terrible harms result from using that phrasing. With
> the problem that started this discussion, the common phrasing seemed to
> provide a straightforward explanation of why one shouldn't get too hung up
> on the sign and significance of the OC_D dummy once interactions are in the
> model. To me, the "common phrasing" may not technically reflect how
> regression works, but it does describe the logical implications of the
> regression models once you have estimated them. Your preferred phrasing, on
> the other hand, even if it is technically correct, strikes me as being very
> difficult to understand and is not at all intuitive. But again, I will go
> over your points more carefully.
>
> I am curious, do you have objections to this example from the Stata Manuals?
>
> use http://www.stata-press.com/data/r12/nhanes2
> logistic highbp sex##agegrp##c.bmi
> margins sex, at(bmi=(10(5)65))
> marginsplot, xlabel(10(10)60)
> sum bmi if female
> sum bmi if !female
>
> It includes values of BMI that are out of range for both men and women; and,
> perhaps more critically, it includes values that women have that men do not.
> i.e. the maximum BMI value for men is 53 and the maximum value for women is
> 61, and comparisons are made all the way up to 65. So, do you think it is
> only legitimate to do this across the range of 14 to 53, a range which both
> men and women have values for?
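One way to keep the comparison between men and women inside the interval of BMI observed for both is to compute that interval from the data before calling margins. The following is a sketch under assumptions rather than anything proposed in the thread: it reuses the dataset and model from the quoted example, the local macro names are illustrative, and rounding the endpoints inward to whole numbers is an arbitrary choice. Because it relies on local macros, it should be run as one block (for example, from a do-file).

```stata
* Sketch only: restrict the margins comparison to BMI values
* observed for both sexes in the estimation sample.
use http://www.stata-press.com/data/r12/nhanes2, clear
logistic highbp sex##agegrp##c.bmi

* Observed BMI range for women, then for men.
quietly summarize bmi if female & e(sample)
local lo_f = r(min)
local hi_f = r(max)
quietly summarize bmi if !female & e(sample)

* Common interval, rounded inward to whole numbers.
local lo = ceil(max(`lo_f', r(min)))
local hi = floor(min(`hi_f', r(max)))
display "BMI interval observed for both sexes: `lo' to `hi'"

* Predictions only within the common interval.
margins sex, at(bmi=(`lo'(5)`hi'))
marginsplot
```

A shared minimum and maximum is only a first check. The data can still be sparse near the ends of the interval or within particular age groups, so looking at the distribution of BMI by sex and age group remains worthwhile.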

