Re: st: svylogitgof after logistic using subpop option


From: jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: svylogitgof after logistic using subpop option
Date: Tue, 08 Mar 2011 18:27:47 -0600

Maria E. Montez Rath <maria.rath@gmail.com> is trying to perform a
goodness-of-fit test for -svy: logistic- with a subpopulation:

> I just found out that the -estat- Stata manual had been updated and
> now includes the goodness of fit test for binary data. I believe that
> -estat gof- is reporting the F-adjusted mean residual test according
> to Archer and Lemeshow (2006).
> 
> Reference
> Archer, K. J., and S. Lemeshow. 2006. Goodness-of-fit test for a
> logistic regression model fitted using survey sample data. Stata
> Journal 6: 97--105.

-estat gof- after -svy: logistic- is in fact using the above referenced
method.
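
For reference, a minimal sketch of the call sequence on a full-sample fit,
with no -subpop()- option involved. It assumes the nhanes2d example dataset,
which is fetched with -webuse- and comes already -svyset-; the model is only
illustrative and is not the poster's.

```stata
* Assumption: nhanes2d example data (not the poster's data), already svyset
webuse nhanes2d, clear
svy: logistic highbp height weight age female
* F-adjusted mean residual goodness-of-fit test
estat gof
```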

> But I still have a problem. I have 10 years of data and so I created a
> smaller dataset that includes my subpopulation augmented by one record
> for each PSU dropped when selecting the subpopulation. In theory this
> should work because the problem with selecting the subpopulation
> directly and doing a conditional analysis is that there is no way for
> the program to know how many PSUs were sampled. By augmenting my
> dataset with the PSUs dropped Stata can still compute n (total number
> of PSUs sampled).  I tested that this would work by comparing the
> results from -svy: logistic- with -subpop()- option using 1) the
> complete one year of data and 2) my augmented data for that same year.
> 
> The results from -svy: logistic- are identical using both methods
> (Point estimates and SEs are equal) but the results from -estat gof-
> are very different where using the entire data the test indicates a
> lack of fit while using my augmented data the test indicates good fit.
> 
> So, I'm still wondering how -estat gof- uses the results from
> -svy: logistic- with the subpopulation option.

At present, neither -svylogitgof- nor -estat gof- does anything to account for
subpopulation estimation.

Since the original article does not specifically address subpopulation
estimation, it is not immediately clear how -estat gof- can be changed to
handle subpopulation estimation results.  We will add this to our research and
development list.

In the short term, we will change -estat gof- to report a warning when it is
used with subpopulation estimation results.
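
For readers who want to reproduce the comparison, the augmented dataset
described in the quoted message might be built along the following lines.
This is only a sketch under stated assumptions: strata_id and psu_id are
hypothetical names for the design variables (the message does not give them),
and pah is taken to be the 0/1 subpopulation indicator used in the quoted
commands.

```stata
* Assumptions: strata_id and psu_id are hypothetical names for the design
* variables; pah is the 0/1 subpopulation indicator from the quoted commands.
use pah08, clear
* 1 if the PSU contains at least one subpopulation member
egen byte anysub = max(pah == 1), by(strata_id psu_id)
* flag exactly one record per PSU
egen byte onerec = tag(strata_id psu_id)
* keep subpopulation members plus one placeholder record for each PSU with none
keep if pah == 1 | (anysub == 0 & onerec == 1)
drop anysub onerec
save pahsubpop08, replace
```

The placeholder records need nonmissing design variables and weights in order
to count toward the number of sampled PSUs. This construction only preserves
the PSU counts for -svy: logistic-; it does not make -estat gof- account for
the subpopulation.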

--Jeff
jpitblado@stata.com

> Using ALL data:
> 
> . use pah08
> . svy, subpop(pah): logistic dead i.aki2 i.diabetes i.mec_vent i.fem
> . estat gof if newpah==1
> 
> Logistic model for dead, goodness-of-fit test
> 
>                    F(9,961) =      3126.59
>                    Prob > F =         0.0000
> 
> Using AUGMENTED data:
> 
> . use pahsubpop08, clear
> . svy, subpop(pah): logistic died i.aki2 i.diabetes i.mec_vent i.fem
> . estat gof
> 
> Logistic model for died, goodness-of-fit test
> 
>                    F(9,961) =         0.66
>                    Prob > F =         0.7500
