Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

RE: st: How to compare models after svy

From	Orian Brook <[email protected]>
To	<[email protected]>
Subject	RE: st: How to compare models after svy
Date	Thu, 23 May 2013 10:22:23 +0100

Thanks Stas, that's helped a lot. So I was right: I couldn't make the test
I wanted using the Stata output about the models. But the Wald test below
shows that constraining the interaction effects to zero doesn't produce a
significant reduction in the explanatory power of the model (i.e. I
shouldn't include them).

Thanks again for your help

Orian

Adjusted Wald test

 ( 1)  X_1*Y_1 = 0
 ( 2)  X_1*Y_3 = 0
 ( 3)  X_3*Y_1 = 0
 ( 4)  X_3*Y_3 = 0

       F(  4, 16845) =    2.08
            Prob > F =    0.0812
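
As a quick check on the output above, the reported p-value and the implied design degrees of freedom can be recomputed by hand. This is only a sketch: it assumes Stata's built-in Ftail() function and the F(k, d - k + 1) reference distribution that -svy- uses for adjusted Wald tests, where k is the number of constraints and d is the design degrees of freedom.

```stata
* Upper-tail probability Pr(F > 2.08) for an F(4, 16845) distribution;
* this should agree with the Prob > F line reported above.
display Ftail(4, 16845, 2.08)

* The denominator d.f. is d - k + 1, so the design d.f. is d = 16845 + 4 - 1.
display 16845 + 4 - 1

* The same d is implied by the two overall model tests quoted further down:
* F(21, 16828) and F(25, 16824).
display 16828 + 21 - 1
display 16824 + 25 - 1
```

All three calculations give d = 16848, and the two models differ by 25 - 21 = 4 parameters, which matches the four constraints tested above.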

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Stas Kolenikov
Sent: 22 May 2013 23:17
To: [email protected]
Subject: Re: st: How to compare models after svy

To put it simply, the F-distribution is a somewhat more accurate
distribution to use than the large sample chi-square (that essentially
assumes an infinite number of degrees of freedom, which is not the case in
cluster surveys, like -webuse nhanes2-, which has only 31 denominator d.f.).
If these F-statistics are the overall tests (you would need to show all of
the output for the list to better advise you what to do), then they are not
exactly commensurate, and cannot be easily converted to test for the
significance of the interactions.
What you would need to do is to form -test (interaction terms)- based on
-svy: logit- results, where the interaction terms would be the explicit list
of your (3x3?) interaction terms.
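
A minimal sketch of the approach described above, run on the -nhanes2- example data rather than the poster's own data. The outcome and the two categorical factors (highbp, race, region) are stand-ins chosen purely for illustration, and -testparm- is used here as a convenient wrapper around -test- that expands the interaction terms automatically.

```stata
* Example survey data shipped with Stata; its svyset design is already stored.
webuse nhanes2, clear

* Logit with a categorical-by-categorical interaction plus main effects.
* highbp, race, region and age are illustrative choices only.
svy: logit highbp i.race##i.region age

* Joint adjusted Wald test that all interaction coefficients are zero.
testparm i.race#i.region
```

After -svy- estimation the result is reported as an adjusted Wald test with an F reference distribution, in the same form as the output Orian posted.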

-- Stas Kolenikov, PhD, PStat (SSC)
-- Senior Survey Statistician, Abt SRBI
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer
-- http://stas.kolenikov.name



On Wed, May 22, 2013 at 7:34 AM, Orian Brook <[email protected]> wrote:
> Thanks for your reply, Stas. Yes, I'm aware that what I usually use is 
> a likelihood ratio test, and I think I understand the explanation 
> given by Stata for why the usual logit postestimation commands don't 
> work with complex survey data. What I wasn't sure of was whether the F
> statistic could in these circumstances be used to compare logit models,
> as the references I found only referred to using it in relation to
> least squares regression.
> However, and I'm sure it's my fault, I'm afraid that I found the links 
> you gave quite technical and not too illuminating for how I can
> actually use the Stata output.
>
> All the references that I find on using the F statistic to compare
> models require me to compare the change in RSS for each model to the
> change in the degrees of freedom - but Stata doesn't give me this, it
> gives me the Mean Square Model divided by the Mean Square Residual. I
> could use each model to create predictions and manually calculate the
> RSS but I'm unsure of whether this would work given it's a prediction
> from a logit.
>
> If it helps to actually give you my output, the relevant F statistics 
> I'm getting are
>
> F(  21,  16828)    =     65.97 and
> F(  25,  16824)    =     56.93
>
> Thanks so much for any further clarification
>
> Orian
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Stas 
> Kolenikov
> Sent: 21 May 2013 14:17
> To: [email protected]
> Subject: Re: st: How to compare models after svy
>
> What you are used to in logistic regression modeling, and what you 
> refer as deviance differences, are essentially likelihood ratio tests 
> (compare the likelihoods of two models, transform them to asymptotic 
> chi-square). This does not work for complex survey data (or at least 
> not in a way that could make it directly usable), so what you would 
> want to do is to use Wald tests available after any estimation command 
> with -test- (as opposed to the logistic-specific fit test commands or
> -lrtest-, whichever you were using).
> With -svy-, these tests would produce an F-test, which is roughly
> chi-square divided by its (estimated) degrees of freedom. In turn, the
> estimated degrees of freedom, as a first approximation, is the number
> of primary sampling units minus the number of strata; and -svy- uses a
> more complicated approximation based on the eigenvalues / generalized
> design effects of the parameter estimates covariance matrix.
>
> If you have never heard about Wald tests (or, for that matter, of 
> likelihood ratio tests), take a look at Buse (1982), 
> http://www.citeulike.org/user/ctacmo/article/890474. Note that it only 
> applies to the "standard" i.i.d. data. The generalized design effects 
> are introduced and discussed in (I believe) Rao and Scott (1981), 
> http://www.citeulike.org/user/ctacmo/article/1036968, reviewed over 
> again in Rao and Thomas (2003),
> http://www.citeulike.org/user/ctacmo/article/8922395.
>
> -- Stas Kolenikov, PhD, PStat (SSC)
> -- Senior Survey Statistician, Abt SRBI
> -- Opinions stated in this email are mine only, and do not reflect the 
> position of my employer
> -- http://stas.kolenikov.name
>
>
>
> On Tue, May 21, 2013 at 7:52 AM, Orian Brook <[email protected]> wrote:
>> Hi all
>>
>> I'm hoping for advice on comparing models after using -svy: logit-, 
>> as the postestimation commands normally used after logit don't work.
>>
>> I had been working without using the -svy- prefix as I'm
>> controlling for most of the criteria that were used in sampling, and
>> in early iterations of the model the results were very similar. With
>> my final model version, the results given by using or not using -svy-
>> are still similar, but the independent variables I'm interested in
>> are slightly more significant and with slightly stronger effect
>> sizes, so I'm in favour of using the prefix!
>> However, in my final version I'm using two interactions (both 
>> categorical-categorical), where not all the interaction effects are 
>> significant. Without -svy- I can compare the change in deviance to the
>> change in degrees of freedom to confirm that the model with 
>> interactions is a better fit. Again, using the -svy- prefix, not all 
>> of the interaction effects are significant and I want to check 
>> whether overall the model with interactions is a better fit, but I'm 
>> not given the log likelihood or deviance statistic. I'm given an 
>> F-test but I thought this was only appropriate for a least squares model?
>>
>> Thanks all
>>
>> Orian
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/
