Notice: On April 23, 2014, Statalist moved from an email list to a forum.

Re: st: How to compare models after svy

From   Stas Kolenikov
Subject   Re: st: How to compare models after svy
Date   Wed, 22 May 2013 18:16:36 -0400

To put it simply, the F-distribution is a somewhat more accurate
reference distribution than the large-sample chi-square. The
chi-square essentially assumes an infinite number of denominator
degrees of freedom, which is not the case in cluster surveys:
-webuse nhanes2-, for example, has only 31 denominator d.f. If these
F-statistics are the overall model tests (you would need to show all
of the output for the list to advise you better), then they are not
commensurate with each other, and cannot easily be converted into a
test of the significance of the interactions. What you need to do is
run -test (interaction terms)- after -svy: logit-, where
(interaction terms) is the explicit list of your (3x3?) interaction
coefficients.
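
As a minimal sketch of that kind of test, here is one way it could look
on the -nhanes2- data mentioned above. The variables highbp, sex and
agegrp are only stand-ins for your own outcome and factors (an
assumption for illustration, not your model), and -testparm- is used
as a shortcut so the interaction coefficients do not have to be typed
out one by one for -test-.

```stata
* Sketch only: nhanes2 ships with its survey design already declared.
* highbp, sex and agegrp are stand-ins for the poster's own variables.
webuse nhanes2, clear
svyset                      // display the stored design settings

* Main effects plus the full categorical-by-categorical interaction
svy: logit highbp i.sex##i.agegrp

* Joint design-based Wald test of all interaction coefficients
testparm i.sex#i.agegrp
```

After -svy- estimation the reported statistic is an F test whose
denominator d.f. come from the survey design rather than from the
number of observations. With two interactions in the model, each one
can be tested with its own -testparm- line.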

-- Stas Kolenikov, PhD, PStat (SSC)
-- Senior Survey Statistician, Abt SRBI
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer

On Wed, May 22, 2013 at 7:34 AM, Orian Brook wrote:
> Thanks for your reply, Stas. Yes, I'm aware that what I usually use is a
> likelihood ratio test, and I think I understand the explanation given by
> Stata for why the usual logit postestimation commands don't work with
> complex survey data. What I wasn't sure of was whether the F statistic could
> in these circumstances be used to compare logit models, as the references I
> found only referred to using it in relation to least squares regression.
> However, and I'm sure it's my fault, I'm afraid that I found the links you
> gave quite technical and not too illuminating for how I can actually use
> the Stata output.
> All the references that I find to using the F statistic to compare models
> require me to compare the change in RSS for each model to the change in the
> degrees of freedom - but Stata doesn't give me this, it gives me the Mean
> Square Model divided by the Mean Square Residual.  I could use each model to
> create predictions and manually calculate the RSS but I'm unsure of whether
> this would work given it's a prediction from a logit.
> If it helps to actually give you my output, the relevant F statistics I'm
> getting are
> F(  21,  16828)    =     65.97 and
> F(  25,  16824)    =     56.93
> Thanks so much for any further clarification
> Orian
> -----Original Message-----
> From: Stas Kolenikov
> Sent: 21 May 2013 14:17
> Subject: Re: st: How to compare models after svy
> What you are used to in logistic regression modeling, and what you refer to
> as deviance differences, are essentially likelihood ratio tests (compare the
> likelihoods of two models, transform them to asymptotic chi-square). This
> does not work for complex survey data (or at least not in a way that could
> make it directly usable), so what you would want to do is to use Wald tests
> available after any estimation command with -test- (as opposed to the
> logistic-specific fit test commands or -lrtest-, whichever you were using).
> With -svy-, these tests would produce an F-test, which is roughly chi-square
> divided by its (estimated) degrees of freedom. In turn, the estimated
> degrees of freedom, as a first approximation, is the number of primary
> sampling units minus the number of strata; and -svy- uses a more complicated
> approximation based on the eigenvalues / generalized design effects of the
> parameter estimates covariance matrix.
> If you have never heard about Wald tests (or, for that matter, of likelihood
> ratio tests), take a look at Buse (1982). Note that it only applies to the
> "standard" i.i.d. data. The generalized design effects are introduced and
> discussed in (I believe) Rao and Scott (1981), and reviewed again in Rao and
> Thomas (2003).
> -- Stas Kolenikov, PhD, PStat (SSC)
> -- Senior Survey Statistician, Abt SRBI
> -- Opinions stated in this email are mine only, and do not reflect the
> position of my employer
> --
> On Tue, May 21, 2013 at 7:52 AM, Orian Brook wrote:
>> Hi all
>> I'm hoping for advice on comparing models after using -svy: logit-, as
>> the postestimation commands normally used after logit don't work.
>> I had been working without using the -svy- prefix as I'm controlling
>> for most of the criteria that were used in sampling, and in early
>> iterations of the model the results were very similar. With my final
>> model version, the results given by using or not using -svy- are
>> still similar, but the independent variables I'm interested in are
>> slightly more significant and with slightly stronger effect sizes, so
>> I'm in favour of using the prefix!
>> However, in my final version I'm using two interactions (both
>> categorical-categorical), where not all the interaction effects are
>> significant. Without -svy- I can compare the change in deviance to the
>> change in degrees of freedom to confirm that the model with
>> interactions is a better fit. Again, using the -svy- prefix, not all
>> of the interaction effects are significant and I want to check whether
>> overall the model with interactions is a better fit, but I'm not given
>> the log likelihood or deviance statistic. I'm given an F-test but I
>> thought this was only appropriate for a least squares model?
>> Thanks all
>> Orian