Re: st: svylogitgof after logistic using subpop option

From   Jeff Pitblado, StataCorp LP
Subject   Re: st: svylogitgof after logistic using subpop option
Date   Tue, 08 Mar 2011 18:27:47 -0600

Maria E. Montez Rath is trying to perform a
goodness-of-fit test for -svy: logistic- with a subpopulation:

> I just found out that the -estat- Stata manual had been updated and
> now includes the goodness of fit test for binary data. I believe that
> -estat gof- is reporting the F-adjusted mean residual test according
> to Archer and Lemeshow (2006).
> Reference
> Archer, K. J., and S. Lemeshow. 2006. Goodness-of-fit test for a
> logistic regression model fitted using survey sample data. Stata
> Journal 6: 97--105.

-estat gof- after -svy: logistic- is in fact using the method from the
article referenced above.

> But I still have a problem. I have 10 years of data and so I created a
> smaller dataset that includes my subpopulation augmented by one record
> for each PSU dropped when selecting the subpopulation. In theory this
> should work because the problem with selecting the subpopulation
> directly and doing a conditional analysis is that there is no way of
> the program to know how many PSUs were sampled. By augmenting my
> dataset with the PSUs dropped Stata can still compute n (total number
> of PSUs sampled).  I tested that this would work by comparing the
> results from -svy: logistic- with -subpop()- option using 1) the
> complete one year of data and 2) my augmented data for that same year.
> The results from -svy: logistic- are identical using both methods
> (Point estimates and SEs are equal) but the results from -estat gof-
> are very different where using the entire data the test indicates a
> lack of fit while using my augmented data the test indicates good fit.
> So, I'm still wondering how -estat gof- uses the results from
> -svy: logistic- with the subpopulation option.
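
For readers following along, the augmented dataset described above could be
built roughly as follows.  This is only a sketch, not Maria's actual code:
-stratid- and -psuid- are hypothetical names for the stratum and PSU
identifiers, and it assumes the -svyset- settings were saved with pah08.

```stata
* Sketch only: stratid and psuid are hypothetical variable names.
use pah08, clear

* Flag PSUs that contain at least one subpopulation member.
bysort stratid psuid: egen byte any_sub = max(pah == 1)

* Tag one record per PSU to serve as a placeholder for empty PSUs.
egen byte onerec = tag(stratid psuid)

* Keep the subpopulation plus one placeholder record per dropped PSU.
keep if pah == 1 | (any_sub == 0 & onerec == 1)

* Refit the model and show the design counts that -svy- stored.
svy, subpop(pah): logistic dead i.aki2 i.diabetes i.mec_vent i.fem
display "strata = " e(N_strata) ", PSUs = " e(N_psu)
```

If the counts displayed here match those from the same model fit on the full
data, the two datasets describe the same design, which is consistent with the
identical point estimates and SEs reported above.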

At present, neither -svylogitgof- nor -estat gof- does anything to account for
subpopulation estimation.

Since the original article does not specifically address subpopulation
estimation, it is not immediately clear how -estat gof- can be changed to
handle subpopulation estimation results.  We will add this to our research and
development list.

In the short term, we will change -estat gof- to report a warning when it is
used with subpopulation estimation results.
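
Until that change is available, a do-file can make the same check itself.
The following is a minimal sketch, not the planned implementation; it relies
on -svy- storing the -subpop()- specification in e(subpop), and the wording
of the note is made up for illustration.

```stata
* Sketch only: run from a do-file (the /// continuation needs one).
svy, subpop(pah): logistic dead i.aki2 i.diabetes i.mec_vent i.fem

* e(subpop) is empty unless -subpop()- was specified.
if "`e(subpop)'" != "" {
    display as text "note: results are from subpop(`e(subpop)'); " ///
        "-estat gof- does not account for subpopulation estimation"
}
estat gof
```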


> Using ALL data:
> . use pah08
> . svy, subpop(pah): logistic dead i.aki2 i.diabetes i.mec_vent i.fem
> . estat gof if newpah==1
> Logistic model for dead, goodness-of-fit test
>                    F(9,961) =      3126.59
>                    Prob > F =         0.0000
> Using AUGMENTED data:
> . use pahsubpop08, clear
> . svy, subpop(pah): logistic died i.aki2 i.diabetes i.mec_vent i.fem
> . estat gof
> Logistic model for died, goodness-of-fit test
>                    F(9,961) =         0.66
>                    Prob > F =         0.7500
