Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Orian Brook <ob11@st-andrews.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: How to compare models after svy
Date: Wed, 22 May 2013 12:34:26 +0100

Thanks for your reply, Stas.

Yes, I'm aware that what I usually use is a likelihood ratio test, and I think I understand the explanation given by Stata for why the usual logit postestimation commands don't work with complex survey data. What I wasn't sure of was whether the F statistic could in these circumstances be used to compare logit models, as the references I found only referred to using it in relation to least squares regression.

However, and I'm sure it's my fault, I'm afraid that I found the links you gave quite technical and not too illuminating for how I can actually use the Stata output. All the references that I find to using the F statistic to compare models require me to compare the change in RSS between the models to the change in the degrees of freedom, but Stata doesn't give me this; it gives me the Mean Square Model divided by the Mean Square Residual. I could use each model to create predictions and manually calculate the RSS, but I'm unsure whether this would work given it's a prediction from a logit.

If it helps to actually give you my output, the relevant F statistics I'm getting are

F( 21, 16828) = 65.97
and
F( 25, 16824) = 56.93

Thanks so much for any further clarification

Orian

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Stas Kolenikov
Sent: 21 May 2013 14:17
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: How to compare models after svy

What you are used to in logistic regression modeling, and what you refer to as deviance differences, are essentially likelihood ratio tests (compare the likelihoods of two models, transform them to asymptotic chi-square). This does not work for complex survey data (or at least not in a way that could make it directly usable), so what you would want to do is to use Wald tests, available after any estimation command with -test- (as opposed to the logistic-specific fit test commands or -lrtest-, whichever you were using).

With -svy-, these tests would produce an F-test, which is roughly chi-square divided by its (estimated) degrees of freedom. In turn, the estimated degrees of freedom, as a first approximation, is the number of primary sampling units minus the number of strata; and -svy- uses a more complicated approximation based on the eigenvalues / generalized design effects of the covariance matrix of the parameter estimates.

If you have never heard about Wald tests (or, for that matter, of likelihood ratio tests), take a look at Buse (1982), http://www.citeulike.org/user/ctacmo/article/890474. Note that it only applies to the "standard" i.i.d. data. The generalized design effects are introduced and discussed in (I believe) Rao and Scott (1981), http://www.citeulike.org/user/ctacmo/article/1036968, and reviewed again in Rao and Thomas (2003), http://www.citeulike.org/user/ctacmo/article/8922395.

-- Stas Kolenikov, PhD, PStat (SSC)
-- Senior Survey Statistician, Abt SRBI
-- Opinions stated in this email are mine only, and do not reflect the position of my employer
-- http://stas.kolenikov.name

On Tue, May 21, 2013 at 7:52 AM, Orian Brook <ob11@st-andrews.ac.uk> wrote:
> Hi all
>
> I'm hoping for advice on comparing models after using -svy: logit-, as
> the postestimation commands normally used after logit don't work.
>
> I had been working without using the -svy- prefix as I'm controlling
> for most of the criteria that were used in sampling, and in early
> iterations of the model the results were very similar. With my final
> model version, the results given by using or not using -svy- are
> still similar, but the independent variables I'm interested in are
> slightly more significant and with slightly stronger effect sizes, so
> I'm in favour of using the prefix!
>
> However, in my final version I'm using two interactions (both
> categorical-categorical), where not all the interaction effects are
> significant.
> Without -svy- I can compare the change in deviance to the
> change in degrees of freedom to confirm that the model with
> interactions is a better fit. Again, using the -svy- prefix, not all
> of the interaction effects are significant and I want to check whether
> overall the model with interactions is a better fit, but I'm not given
> the log likelihood or deviance statistic. I'm given an F-test but I
> thought this was only appropriate for a least squares model?
>
> Thanks all
>
> Orian
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
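
A note on the two F statistics quoted in the thread, offered as a sketch and not as part of the original exchange. Their denominators are consistent with an adjusted Wald test of the form F = (d - k + 1) / (k d) * W on (k, d - k + 1) degrees of freedom, where W is the Wald chi-square, k is the number of coefficients tested and d is the design degrees of freedom (PSUs minus strata, as Stas describes). That form is an assumption here; it is not stated in the thread. Under it, both reported statistics imply the same d, which suggests that each one tests whether all of that model's coefficients are zero, so the two numbers are not something to difference the way one would with RSS. The Python below checks that arithmetic and converts a hypothetical Wald chi-square for the extra terms into an F. The function names and the value 12.0 are illustrative only.

```python
def adjusted_wald_f(wald_chi2, k, design_df):
    """Convert a Wald chi-square on k constraints into a survey-adjusted F.

    ASSUMPTION (not stated in the thread): the adjusted form
        F = (d - k + 1) / (k * d) * W,  referred to F(k, d - k + 1),
    where d is the design degrees of freedom.
    Returns (F, numerator df, denominator df).
    """
    if not 0 < k <= design_df:
        raise ValueError("need 0 < k <= design_df")
    denom_df = design_df - k + 1
    f_stat = denom_df / (k * design_df) * wald_chi2
    return f_stat, k, denom_df


def implied_design_df(k, denom_df):
    """Invert denom_df = d - k + 1 to recover the design df d."""
    return denom_df + k - 1


# Degrees of freedom reported in the thread: F(21, 16828) and F(25, 16824).
d_without = implied_design_df(21, 16828)
d_with = implied_design_df(25, 16824)
print("implied design df:", d_without, d_with)  # both models give the same d

# The larger model tests 4 more coefficients than the smaller one.
extra_terms = 25 - 21

# HYPOTHETICAL Wald chi-square for those 4 terms; the real value would come
# from a joint test of the extra coefficients in the larger model.
hypothetical_w = 12.0
f_stat, num_df, den_df = adjusted_wald_f(hypothetical_w, extra_terms, d_with)
print(f"F({num_df}, {den_df}) = {f_stat:.3f}")
```

With a design df this large the adjustment factor is very close to 1/k, which matches Stas's description of the F as roughly the chi-square divided by its degrees of freedom. In Stata, the statistic for the extra terms would come from running -test- on the interaction coefficients after fitting the larger model with -svy: logit-, as Stas suggests.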