



Re: st: Multicollinearity problem in Logistic survival analysis


From   Maarten buis <maartenbuis@yahoo.co.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Multicollinearity problem in Logistic survival analysis
Date   Sat, 27 Mar 2010 09:25:25 +0000 (GMT)

--- On Sat, 27/3/10, Lu, Zhenyan wrote:
> In my research I have 6 variables that are highly
> correlated, correlation value up to .71 to .83 based on
> large samples (n>140,000). <snip> So I am really 
> concerned about the potential problem in the model. 

By adding multiple explanatory variables you want to be
able to distinguish between them. If two variables are
*perfectly* correlated, how would you be able to distinguish
between the two? This is why Stata will drop variables 
when there is perfect correlation. If two variables are 
strongly but not perfectly correlated, then that means that
it will be more difficult for Stata (or any other statistical
software package) to distinguish the effects of the two 
variables. This leads to higher standard errors, which is
exactly as it should be: It is more difficult to distinguish
the variables, so we are more uncertain about the results,
so the standard errors should be larger. In other words,
there is no problem.
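
To put a rough number on "higher standard errors", here is a
minimal Python sketch. It assumes the simplest case: a linear
model with only two predictors correlated at r, where the
sampling variance of each coefficient is inflated by
1 / (1 - r^2) compared with uncorrelated predictors. With six
predictors the relevant quantity is the R-squared of each
predictor regressed on all the others, and in a logistic model
the same logic holds only approximately, so treat the numbers
as an illustration. The function name is made up for this
example.

```python
import math


def se_inflation(r):
    """Standard-error inflation for two predictors correlated at r.

    This is the square root of the variance inflation factor
    1 / (1 - r^2); it assumes a linear model with two predictors.
    """
    if not -1 < r < 1:
        # Perfect correlation: the two effects cannot be separated,
        # which is the case where a variable gets dropped.
        raise ValueError("predictors are perfectly correlated")
    return math.sqrt(1.0 / (1.0 - r * r))


# The correlations mentioned in the question, plus 0 as a baseline.
for r in (0.0, 0.71, 0.83):
    print(f"r = {r:.2f}: standard error multiplied by {se_inflation(r):.2f}")
```

Under these assumptions the standard errors grow by a factor
of roughly 1.4 to 1.8 for the correlations in the question.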

> And even more complicated is that I have to include square
> terms for each of these 6 variables in the model at the
> same time to test the curvilinear relationship. 

Adding square terms is a very limited way of checking for
curvilinearity. I like the linear spline (see:
-help mkspline-) as a good compromise between a flexible
non-linear curve and parameters with an easy interpretation.
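
To show why the parameters of a linear spline are easy to
interpret, here is a small Python sketch of how one variable
can be split into linear spline pieces. It follows the
non-marginal parameterisation that, as far as I know, is the
default in -mkspline-: each new variable only moves within its
own segment, so its coefficient is the slope in that segment.
The function name and the knots (10 and 20) are hypothetical
and chosen only for illustration.

```python
def linear_spline(x, knots):
    """Split x into len(knots) + 1 linear spline pieces.

    Piece i changes only while x is inside segment i, so a model
    coefficient on piece i is the slope within that segment.
    """
    pieces = []
    lower = None  # lower boundary of the current segment
    for k in sorted(knots):
        if lower is None:
            # First segment: everything up to the first knot.
            pieces.append(min(x, k))
        else:
            # Middle segment: the part of x between lower and k.
            pieces.append(max(min(x, k), lower) - lower)
        lower = k
    # Last segment: the part of x above the last knot.
    pieces.append(x if lower is None else max(x, lower) - lower)
    return pieces


# Hypothetical knots at 10 and 20.
for x in (5, 15, 25):
    print(x, linear_spline(x, [10, 20]))
```

The pieces always add up to the original value, so entering
them together in a model replaces the single linear term with
one slope per segment.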

Others prefer smoother non-linear curves, such as restricted
cubic splines or fractional polynomials. If you want to
interpret the results of those curves you'll have to make
graphs.

For restricted cubic splines see:
http://www.stata.com/meeting/sweden09/se09_orsini.pdf
and
http://ideas.repec.org/p/boc/dsug09/04.html

For fractional polynomials see: -help fracpoly-.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------


      


