Statalist The Stata Listserver


Re: st: logistic regression with orthogonal predictors

Date   Wed, 17 May 2006 08:37:14 -0400


Apologies for not responding sooner.  I was out of the office.  Thanks for 
sharing your handouts.  They were very clear.  I especially liked the RWLS 

I have never been a fan of standardized coefficients in OLS for a number 
of reasons and typically argue against using them, at least for the 
typical reasons used to justify their use.  However, I see the advantage 
you bring up for nested models in logistic regression, and I assume other 
GLMs.  However, in addition to the issue of coefficients increasing in 
size as one adds predictors, one runs into the situation, that cannot be 
attributed to suppression, where predictors that were not statistically 
significant become statistically significant as the variance in Y* 

Thanks again.

Mike Frone

Richard Williams wrote on 05/16/2006 09:23 AM:

At 02:35 PM 5/15/2006, wrote:
>A colleague asked me about some results with logistic regression.  He had
>two predictors of a binary outcome, call them A and B.  When used alone,
>predictor A was significantly related to the outcome and predictor B was
>not.  Moreover, the correlation between A and B was zero.  When the
>outcome was regressed on the two predictors simultaneously using logistic
>regression, both were significantly related to the outcome.  In effect, the
>coefficient for predictor B became larger.  However, when OLS regression
>was used instead, the coefficients for each predictor were the same as
>when entered alone, which is what one would expect.

To elaborate a bit on my last answer - in OLS, the variance of y is 
the variance of y, i.e. it doesn't matter whether y is regressed on 
X1, or X1 and X2, or X1 and X2 and X3 - the variance of y will be the 
same in every case.

BUT, in logistic regression (also probit and others) the variance of 
the underlying latent variable y* changes as you go from one model to 
the next, i.e. the variance of y* will be different when y is 
regressed on X1 than when it is regressed on X1 and X2.  This is 
because, in a logistic regression, the latent variable is normalized 
by fixing its residual variance at about 3.29 (in probit it is fixed 
at 1).  Since the residual variance is fixed, as more vars are added, 
the explained variance increases, and the total variance of y* 
increases.  In short, with logit and probit, your dv is a moving 
target, i.e. its variance changes from one model to the next.  Hence, 
even when the Xs are uncorrelated, you see behavior such as was 
described in the original message.
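
To put the argument above in symbols, here is a rough sketch.  The 
notation (beta_1, beta_2, sigma_2) is mine and not taken from the 
handouts.  The first line is just the decomposition described above; 
the 3.29 is pi^2/3, the variance of a standard logistic error.  The 
second line is the well-known probit case, and it assumes that x2 is 
normally distributed with variance sigma_2^2 and independent of x1. 
For logit the same kind of shrinkage holds only approximately.

```latex
% Latent-variable model with the residual variance fixed by normalization
y^* = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon, \qquad
\operatorname{Var}(y^*) = \operatorname{Var}(\beta_1 x_1 + \beta_2 x_2) + \operatorname{Var}(\varepsilon),
\qquad
\operatorname{Var}(\varepsilon) =
\begin{cases}
\pi^2/3 \approx 3.29 & \text{(logit)} \\
1 & \text{(probit)}
\end{cases}

% Probit, x_2 omitted, assuming x_2 \sim N(0, \sigma_2^2) independent of x_1:
% the composite error \beta_2 x_2 + \varepsilon has variance 1 + \beta_2^2 \sigma_2^2,
% and renormalizing it back to variance 1 shrinks the x_1 coefficient.
\beta_1^{(x_1 \text{ only})} = \frac{\beta_1}{\sqrt{1 + \beta_2^2 \sigma_2^2}}
```

So adding an uncorrelated x2 moves the x1 coefficient from the 
shrunken value back toward beta_1, even though nothing about the 
relationship between x1 and the outcome has changed.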

The handouts I cited earlier also show that, if you use RWLS (Rich 
Williams's Least Squares - a little known method and deservedly so) 
you can get the same sort of behavior in OLS, i.e. if you fix the 
residual variance at a specific value (e.g. 3.29) then the 
coefficient estimates behave in the same odd ways.
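
The rescaling idea can be tried on a toy data set.  What follows is a 
minimal sketch and not the procedure from the handouts: it assumes 
that fixing the residual variance amounts to multiplying y (and hence 
every slope) by sqrt(3.29 / residual variance).  The data, the 
function names, and the use of n as the variance divisor are all made 
up for illustration.  The two predictors are constructed to be 
mean-zero and exactly orthogonal.

```python
import math

# Variance of a standard logistic error, about 3.29
LOGIT_RESID_VAR = math.pi ** 2 / 3


def fit_orthogonal_ols(y, xs):
    """OLS of y on mean-zero, mutually orthogonal predictors.

    Under that assumption each slope is sum(x*y)/sum(x*x), whether or
    not the other predictors are in the model.
    Returns (slopes, residual variance using divisor n).
    """
    ybar = sum(y) / len(y)
    slopes = [
        sum(a * (b - ybar) for a, b in zip(x, y)) / sum(a * a for a in x)
        for x in xs
    ]
    resid = [
        yi - ybar - sum(b * x[i] for b, x in zip(slopes, xs))
        for i, yi in enumerate(y)
    ]
    resid_var = sum(r * r for r in resid) / len(y)
    return slopes, resid_var


def rescale(slopes, resid_var, target=LOGIT_RESID_VAR):
    """Slopes after rescaling y so that its residual variance equals target."""
    k = math.sqrt(target / resid_var)
    return [b * k for b in slopes]


# Hypothetical data: x1, x2 and the noise e are mutually orthogonal
x1 = [-1, -1, 1, 1, -1, -1, 1, 1]
x2 = [-1, 1, -1, 1, -1, 1, -1, 1]
e = [0.5, 0.5, 0.5, 0.5, -0.5, -0.5, -0.5, -0.5]
y = [2 * a + 3 * b + c for a, b, c in zip(x1, x2, e)]

b_small, v_small = fit_orthogonal_ols(y, [x1])      # y on x1 only
b_full, v_full = fit_orthogonal_ols(y, [x1, x2])    # y on x1 and x2

print("raw b1:     ", b_small[0], b_full[0])
print("rescaled b1:", rescale(b_small, v_small)[0], rescale(b_full, v_full)[0])
```

The raw x1 slope is 2 in both models, as expected with orthogonal 
predictors.  After rescaling it is about 1.19 in the x1-only model 
and about 7.26 once x2 is added, because x2 absorbs most of the 
residual variance and the fixed target of 3.29 then stretches the 
scale of y.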

In short, you have to realize that a lot of the things we are used to 
in OLS do not work the same way in logit and probit.  In OLS, our DV 
is an observed variable; in logit and probit, our DV is actually a 
latent unobserved variable (all we see is the 0-1 dichotomy that is 
caused by the underlying latent variable).

Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
FAX:    (574)288-4373
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
