
From: "Lachenbruch, Peter" <Peter.Lachenbruch@oregonstate.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: RE: Testing nested models using logistic regression with robust standard errors
Date: Mon, 28 Apr 2008 14:47:43 -0700

There may be a greater horror in store for us when we try to develop models - if there are missing values, the number of observations in each of the runs will likely differ. The variables you select will depend on the order in which you drop them... there are no good solutions for this. I've tried one possibility, which is to require e(sample)==1 from the full model and then continue - this is equivalent (I think) to a backward stepping model, which is yukky. Another possibility is to use multiple imputation and then drop the least significant variables. You can't use backward stepping, but it's a simple process with ice and mim. Anyway, beware the missing value.

Tony

Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of John LeBlanc
Sent: Monday, April 28, 2008 1:56 PM
To: Nick Cox
Subject: Re: st: RE: Testing nested models using logistic regression with robust standard errors

Thanks for the reply. I take your point about the limitations of sw regression and I will be more hesitant in using it. However, whether one uses sw or a more appropriate theory-driven approach with thoughtful removal of variables, there is still the problem of testing whether a more parsimonious model differs in its fit to the data from the more saturated model. Is there any alternative to lrtest that is appropriate for robust SE? Is the problem that one can't really specify the error distributions of these models when robust SE are used?

On Mon, 28 Apr 2008 19:49:10 +0100, Nick Cox wrote:

I can't answer your deeper question about nested models. The simpler question here is about the decision rule to drop variables during the stepwise procedure. Stata is using precisely the decision rule you specified in your command, -pr(0.2)-. That is the significance level for removal, as shown in action in your output. If you specify robust standard errors, what does and does not satisfy this rule may well change, as with different standard errors different significance levels will be calculated, but again you get what you ask for.

On what you should do, that depends on how seriously you take the advice of Frank Harrell and others that stepwise methods are generally a bad idea. (Google for sources.) Similarly, every expert has a different way to balance parsimony and goodness of fit, and I would not want to try to add another.

Nick
n.j.cox@durham.ac.uk

John LeBlanc, reporting a query from Magda Szumilas

I'm a graduate student who is new to Stata. For my thesis, I'm trying to figure out how I can test nested models when I'm forced to use robust standard errors. Stata tells me that I can't use lrtest and I understand that, since it depends on maximum likelihood estimates. So what does one use?

Here's what I did. Having done an initial backwards stepwise logistic regression at pr(0.2), I would like to manually create a parsimonious model with the best possible fit. I assume that Stata is using some decision rule to drop variables during the stepwise procedure; is this what I should use when I try to drop them manually? What is Stata's decision rule for stepwise logistic regression using robust standard errors? I found nothing in the manual and nothing helpful after extensive searching on the web.

**************************************
An example below:

. xi: sw logistic usemh3 i.grade sexorcat markcat partcat livecat edumomcat edudadcat sexriskcat anysmoke if sex==1, cluster(site) pr(0.2)
i.grade           _Igrade_10-12       (naturally coded; _Igrade_10 omitted)
                      begin with full model
p = 0.6664 >= 0.2000  removing markcat
p = 0.6006 >= 0.2000  removing edumomcat
p = 0.5856 >= 0.2000  removing _Igrade_12
p = 0.2054 >= 0.2000  removing sexorcat
p = 0.2113 >= 0.2000  removing _Igrade_11
p = 0.2592 >= 0.2000  removing partcat

Logistic regression                               Number of obs   =        580
                                                  Wald chi2(1)    =         ..
                                                  Prob > chi2     =         ..
Log pseudolikelihood = -266.26595                 Pseudo R2       =     0.0691

                                   (Std. Err. adjusted for 3 clusters in site)
------------------------------------------------------------------------------
             |               Robust
      usemh3 | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     livecat |   .5896426   .1786052    -1.74   0.081      .325654    1.067631
   edudadcat |   1.602875   .1808557     4.18   0.000     1.284863    1.999597
  sexriskcat |   .4266733   .0246379   -14.75   0.000     .3810162    .4778014
    anysmoke |   2.502815    .266854     8.60   0.000     2.030824    3.084503
------------------------------------------------------------------------------

. estimates store full

. xi: sw logistic usemh3 i.grade sexorcat markcat partcat livecat edumomcat edudadcat anysmoke if sex==1, cluster(site) pr(0.2)
i.grade           _Igrade_10-12       (naturally coded; _Igrade_10 omitted)
                      begin with full model
p = 0.6856 >= 0.2000  removing markcat
p = 0.5475 >= 0.2000  removing _Igrade_12
p = 0.2756 >= 0.2000  removing sexorcat
p = 0.2803 >= 0.2000  removing partcat
p = 0.2756 >= 0.2000  removing _Igrade_11

Logistic regression                               Number of obs   =        600
                                                  Wald chi2(1)    =         ..
                                                  Prob > chi2     =         ..
Log pseudolikelihood =  -284.2349                 Pseudo R2       =     0.0489

                                   (Std. Err. adjusted for 3 clusters in site)
------------------------------------------------------------------------------
             |               Robust
      usemh3 | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     livecat |   .7364448   .1749249    -1.29   0.198     .4623359    1.173067
   edudadcat |   1.366027   .2559876     1.66   0.096     .9461231     1.97229
   edumomcat |    1.35079    .278478     1.46   0.145      .901788     2.02335
    anysmoke |   2.571288   .1562286    15.54   0.000     2.282615    2.896468
------------------------------------------------------------------------------

. lrtest full
LR test likely invalid for models with robust vce
r(498);

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
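
The two runs in the example report 580 and 600 observations, which is the missing-value problem described at the top of this message. A minimal sketch of holding the sample fixed with e(sample) from the full model, and then testing the dropped terms with a Wald test rather than lrtest, might look like the following. The Wald test via testparm is an addition here, not something proposed in the messages in this thread; it is one commonly used alternative when the variance estimate is robust. The variable name insample and the particular list of variables being tested are illustrative assumptions, and the variable names are simply copied from the example. Note also that the output shows only 3 clusters, so the robust variance matrix has very low rank and a joint test on several coefficients may drop constraints or not be computable at all.

```stata
* Sketch only, written for a do-file (/// continues a line in do-files).
* Variable names come from the example in this thread; -insample- is a
* hypothetical name introduced for illustration.

* Fit the full model once, with cluster-robust standard errors.
xi: logistic usemh3 i.grade sexorcat markcat partcat livecat edumomcat ///
    edudadcat sexriskcat anysmoke if sex==1, cluster(site)

* Flag the estimation sample so later models use the same observations.
generate byte insample = e(sample)

* Wald test that the candidate terms are jointly zero. Unlike -lrtest-,
* this uses the robust variance estimate of the model just fitted.
testparm markcat edumomcat sexorcat partcat

* Refit the reduced model on the identical sample for comparison.
logistic usemh3 livecat edudadcat sexriskcat anysmoke if insample, cluster(site)
```

Restricting every model to insample keeps the number of observations constant across fits, at the cost of discarding cases that are complete for the smaller model but not for the full one.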

**Follow-Ups**:
**RE: st: RE: Testing nested models using logistic regression with robust standard errors** *From:* Richard Williams <Richard.A.Williams.5@ND.edu>

**References**:
**st: RE: Testing nested models using logistic regression with robust standard errors** *From:* "Nick Cox" <n.j.cox@durham.ac.uk>

**Re: st: RE: Testing nested models using logistic regression with robust standard errors** *From:* John LeBlanc <leblancj@dal.ca>

