Statalist The Stata Listserver


st: logistic regression with orthogonal predictors


From   frone@ria.buffalo.edu
To   <statalist@hsphsun2.harvard.edu>
Subject   st: logistic regression with orthogonal predictors
Date   Mon, 15 May 2006 15:35:15 -0400

A colleague asked me about some results with logistic regression.  He had 
two predictors of a binary outcome, call them A and B.  When used alone, 
predictor A was significantly related to the outcome and predictor B was 
not.  Moreover, the correlation between A and B was zero.  When the 
outcome was regressed on the two predictors simultaneously using logistic 
regression, both were significantly related to the outcome.  In effect, the 
coefficient for predictor B became larger.  However, when OLS regression 
was used instead, the coefficients for each predictor were the same as 
when entered alone, which is what one would expect.

So I tried a little experiment.  I selected a binary outcome and two 
predictors that were moderately correlated, age and job tenure, r = .44. 

I regressed the binary outcome on each variable separately and together 
using OLS and logistic regression.  I obtained the same pattern of 
results across logistic and OLS regression.  By themselves, both age and job 
tenure were significant predictors of the outcome.  But when entered 
together, only age was significant. 

Then I created versions of age and job tenure that were orthogonal using 
-orthog-, basically taking out the variance in job tenure attributable to 
age (the more important predictor of the outcome). 
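
To make the orthogonalization step concrete, here is a minimal sketch in Python (standard library only).  It simply removes from job tenure the part that is linearly predictable from age and checks that the result is uncorrelated with age.  The helper names and the eight data points are hypothetical, not the data described here, and the sketch is not a reimplementation of -orthog-, which may differ in detail (for instance in how the new variables are scaled).

```python
import math

def mean(v):
    return sum(v) / len(v)

def residualize(x, z):
    """Return the OLS residuals of x regressed on z (with a constant)."""
    mx, mz = mean(x), mean(z)
    slope = (sum((zi - mz) * (xi - mx) for xi, zi in zip(x, z))
             / sum((zi - mz) ** 2 for zi in z))
    return [(xi - mx) - slope * (zi - mz) for xi, zi in zip(x, z)]

def corr(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = mean(u), mean(v)
    suv = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    suu = sum((ui - mu) ** 2 for ui in u)
    svv = sum((vi - mv) ** 2 for vi in v)
    return suv / math.sqrt(suu * svv)

# Hypothetical values for illustration only.
age = [23, 31, 35, 42, 47, 52, 58, 61]
tenure = [1, 4, 3, 10, 8, 15, 12, 20]

# Job tenure with the part attributable to age taken out.
tenure_orth = residualize(tenure, age)

r_before = corr(age, tenure)       # correlation of the raw variables
r_after = corr(age, tenure_orth)   # zero up to rounding error

print("r before:", round(r_before, 3))
print("r after is numerically zero:", abs(r_after) < 1e-9)
```

Age is left as it is and only job tenure is adjusted, which matches the description of taking out the variance in job tenure attributable to age.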

I again regressed the binary outcome on each orthogonal variable 
separately and together.  By themselves, as expected, age was significant 
and job tenure was not in both OLS and logistic regression.  But here is 
the crux of the issue: 

When I regressed the binary outcome on the two orthogonal predictors using 
OLS regression, their regression coefficients, reported to 8 decimal 
places, were identical to the coefficients I obtained when they were 
entered separately. 

In contrast, when I regressed the binary outcome on the two orthogonal 
predictors using logistic regression, their regression coefficients were 
not the same as obtained when treated separately.  The coefficients for 
the highly significant predictor, age, were nearly identical:

a) when entered by itself:  b =  -.7376565, p = 0.000

b) when entered with job tenure:  b = -.7450136, p = 0.000

However, this is what I obtained for job tenure: 

c) when entered by itself:  b = -.0704451, p = 0.227 

d) when entered with age:  b = -.1363843, p = 0.097 
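
The comparison above can be tried on simulated data.  The sketch below (Python, standard library only) builds two predictors that are exactly uncorrelated in the sample, generates a binary outcome from a logistic model, and fits OLS and logistic regressions with each predictor alone and with both together.  Everything in it is an assumption made for illustration: the sample size, the seed, the true slopes of 1.5, and the helper functions `solve`, `fit_ols`, and `fit_logit`.  It does not use the data described here and is not meant as a diagnosis of those results.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ols(X, y):
    """Least-squares coefficients from the normal equations."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def fit_logit(X, y, iters=25):
    """Maximum-likelihood logistic regression by Newton-Raphson."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for r, yi in zip(X, y):
            eta = sum(bj * v for bj, v in zip(beta, r))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * r[i]
                for j in range(k):
                    hess[i][j] += w * r[i] * r[j]
        beta = [bj + s for bj, s in zip(beta, solve(hess, grad))]
    return beta

random.seed(2006)  # arbitrary seed
n = 2000           # assumed sample size
a = [random.gauss(0.0, 1.0) for _ in range(n)]
raw_b = [0.5 * ai + random.gauss(0.0, 1.0) for ai in a]

# Make b exactly uncorrelated with a in this sample by residualizing on a.
ma, mb = sum(a) / n, sum(raw_b) / n
slope = (sum((ai - ma) * (bi - mb) for ai, bi in zip(a, raw_b))
         / sum((ai - ma) ** 2 for ai in a))
b = [(bi - mb) - slope * (ai - ma) for ai, bi in zip(a, raw_b)]

# Binary outcome from a logistic model with assumed slopes of 1.5 each.
y = [1.0 if random.random() < 1.0 / (1.0 + math.exp(-(1.5 * ai + 1.5 * bi))) else 0.0
     for ai, bi in zip(a, b)]

X_a = [[1.0, ai] for ai in a]
X_b = [[1.0, bi] for bi in b]
X_ab = [[1.0, ai, bi] for ai, bi in zip(a, b)]

ols_a, ols_b, ols_ab = fit_ols(X_a, y), fit_ols(X_b, y), fit_ols(X_ab, y)
logit_a, logit_b, logit_ab = fit_logit(X_a, y), fit_logit(X_b, y), fit_logit(X_ab, y)

print("OLS   slope of a: alone", ols_a[1], "joint", ols_ab[1])
print("OLS   slope of b: alone", ols_b[1], "joint", ols_ab[2])
print("logit slope of a: alone", logit_a[1], "joint", logit_ab[1])
print("logit slope of b: alone", logit_b[1], "joint", logit_ab[2])
```

In this setup the OLS slopes agree between the separate and joint fits up to rounding error, while the logistic slopes do not, even though the two predictors are uncorrelated.  Here both predictors have sizeable true effects, so the gap between the separate and joint logistic slopes is easy to see; with a weak predictor it would be smaller.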

It's not clear to us why this happens.  In both our cases, the variable 
affected the most is uncorrelated with the other predictor and has either 
no relation or a weak, nonsignificant relation to the outcome on its own.  
Yet that nonsignificant variable can become statistically significant when 
the other predictor is added, as it did in the original case and nearly 
did in this one.  There is no such issue with linear regression.  Is this 
just a trivial issue with a marginal predictor, or is there some more 
general issue?

Mike Frone

