
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: multicollinearity with survey data


From   Christine Gourin <[email protected]>
To   "[email protected]" <[email protected]>
Subject   RE: st: multicollinearity with survey data
Date   Wed, 23 Feb 2011 15:33:02 -0500

Thank you VERY much!

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Steven Samuels
Sent: Wednesday, February 23, 2011 9:01 AM
To: [email protected]
Subject: Re: st: multicollinearity with survey data

--

On Feb 23, 2011, at 12:03 AM, Christine Gourin wrote:

> thank you!
> how do you test for collinearity with survey data, however?

Christine:

1. Variables can get dropped from any type of regression because of collinearity. However, in logistic regression they can also get dropped if a category gives perfect prediction, as you found.

2. To measure collinearity, you used e(r2), the overall multiple R-squared after -svy: reg-. However, that R-squared says nothing about collinearity: it quantifies the association of the outcome with all the predictors. The R-squareds used in the definition of the VIF come from the regressions of each predictor on the others, so that VIF_j = 1/(1 - R_j^2), where R_j^2 is the R-squared from regressing predictor j on the remaining predictors.

3. The weights are the only part of the survey design that enters the estimation of the VIF. Therefore, to check for collinearity with survey data, run (non-survey) -regress- with [pw=] weights; then use -estat vif-.

**********************************
sysuse auto, clear
reg mpg weight turn price [pw=length]
estat vif
*********************************
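
As a rough cross-check of point 2, the VIF for a single predictor can be computed by hand from the auxiliary regression of that predictor on the others, using the same weights. This is only a sketch on the auto data from the example above; length stands in for a sampling weight and is not a real survey weight. Under those assumptions, the value displayed for weight should agree with the one that -estat vif- reports.

**********************************
sysuse auto, clear

* Same model as above; length is only a stand-in for a sampling weight
regress mpg weight turn price [pw=length]
estat vif

* Auxiliary regression: one predictor on the remaining predictors, same weights
quietly regress weight turn price [pw=length]
display "R-squared of weight on turn and price: " e(r2)
display "VIF for weight: " 1/(1 - e(r2))
**********************************

Note that the outcome (mpg) plays no role in the auxiliary regression, which is why the overall R-squared of the outcome model is not a collinearity measure.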

Steve

Steven J. Samuels
Consulting Statistician
18 Cantine's Island
Saugerties, NY 12477 USA
Voice: 845-246-0774
Fax:   206-202-4783 
[email protected]


> On Feb 22, 2011, at 11:55 AM, Christine Gourin wrote:
> 
> I have a question about how to check for multicollinearity with survey data. The only information I can find about this is at the site
> http://www.stata.com/support/faqs/res/statalist.html#toask
> 
> I am using survey data to investigate variables associated with hospital volume (HVH) as the dependent variable.
> I suspect that teaching status (HOSP_TEACH) is collinear with HVH, as all HVH hospitals are teaching hospitals.
> 
> I am not sure how to check for multicollinearity in the full model, which is
> 
> 
> xi: svy: logistic HVH elective i.agecat flap neckdissection i.procedure i.payor radiation HOSP_TEACH  i.RACE i.comorbidity
> 
> 
> 
> When I run this model, Stata drops HOSP_TEACH, saying it predicts failure perfectly.
> 

This message has nothing to do with multicollinearity. Multicollinearity concerns the correlations of predictors with each other. This message refers to the association of the outcome with one predictor. Tabulating HVH against HOSP_TEACH should show you the problem.
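
A minimal sketch of that tabulation, assuming the poster's dataset is already in memory (HVH and HOSP_TEACH are the variable names from the model above; nothing else about the data is known here):

**********************************
* Cross-tabulate the predictor against the outcome, showing missing values too
tabulate HOSP_TEACH HVH, missing
**********************************

An empty cell in this table (a level of HOSP_TEACH at which HVH takes only one value) is what leads -logistic- to report perfect prediction.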


Steve

Steven J. Samuels
Consulting Statistician
18 Cantine's Island
Saugerties, NY 12477 USA
Voice: 845-246-0774
Fax:   206-202-4783
[email protected]





*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

