Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: svylogitgof after logistic using subpop option


From   "Maria E. Montez Rath" <[email protected]>
To   [email protected]
Subject   Re: st: svylogitgof after logistic using subpop option
Date   Sun, 6 Mar 2011 22:20:51 -0800

Thank you for your response. I've read the FAQ but missed that
important detail. Sorry.

Maria

On Sun, Mar 6, 2011 at 5:41 PM, Steven Samuels <[email protected]> wrote:
>
> -svylogitgof- is not an official command. The Statalist FAQ requests that you identify non-official commands as such and say where you got them. -svylogitgof- is not "subpopulation-aware". To use it for a subpopulation, you will have to first run -svy logistic- with an -if- clause, not the subpop() option.
>
>
> Steve
> [email protected]
>
> On Mar 6, 2011, at 7:21 PM, Maria E. Montez Rath wrote:
>
> Hi!
>
> I'm using the NIS which follows a complex survey design to obtain the
> odds of dying for patients with acute kidney disease in a
> subpopulation. I'll be using 10 years of data which will make the
> dataset too big. Since I'm interested in a subpopulation I found out
> that in order to obtain correct standard errors, my dataset only needs
> to include the subpopulation plus one record for each PSU that would
> be dropped when creating the subpopulation dataset. This way, I can
> still use the svy, subpop(): logistic command because Stata can still
> compute the total number of hospitals sampled.
>
> While testing this theory I found that Stata will give me the same
> results whether I use the entire sample or my augmented subpopulation
> data but the goodness of fit test using svylogitgof is very different.
> I also found that svylogitgof is reporting the number of observations
> in the total sample and not the subpopulation number of observations.
> Does this have any implication in the actual test?
>
> Below you can see the results from my test. First, is the output using
> the entire dataset and second using my augmented subpopulation
> dataset.
>
> The output from svy logistic is identical; the only difference is the
> reported population size, which is wrong for my augmented dataset, as
> expected. All the other results (ORs, SEs, t, ...) are equal.
>
> The output for the goodness-of-fit test is very different. As you can
> see, the number of observations reported is the total number of
> observations in the data even though I'm doing a subpopulation
> analysis. We see that the number of groups used is different. Using
> the entire dataset, the test rejects the hypothesis that the model is
> a good fit, but using my augmented dataset we do not reject that
> hypothesis. But they are the same model, so how can I get such
> different results?
>
> I have read the paper on the test and I don't see where the number of
> observations comes into play. Also, the paper assumed that the number
> of groups used was 10 (generating deciles of risk). In the new
> svylogitgof update, the number of groups is allowed to vary.
>
> Can anyone help me? I don't know what to make of these results, and I
> surely cannot use them, as I don't think the test applied to the entire
> dataset is correct either.
>
> Thank you,
>
> Maria
>
> Using ALL data:
>
> . svy, subpop(pah): logistic dead i.diabetes i.aki2 i.mec_vent i.fem
> Survey: Logistic regression
>
> Number of strata   =        58        Number of obs      =    8104197
> Number of PSUs     =      1027        Population size    =   39615465
>                                       Subpop. no. of obs =       1971
>                                       Subpop. size       =  9686.4649
>                                       Design df          =        969
>                                       F(   4,    966)    =      27.18
>                                       Prob > F           =     0.0000
>
> ------------------------------------------------------------------------------
>              |             Linearized
>         dead | Odds Ratio   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>       1.aki2 |   3.511044   .8979891     4.91   0.000     2.125493    5.799799
>   1.diabetes |   .4748568   .1459044    -2.42   0.016     .2598337    .8678202
>   1.mec_vent |   9.576589   2.515918     8.60   0.000     5.718832    16.03668
>        1.fem |    1.88229   .5211665     2.28   0.023     1.093231    3.240866
> ------------------------------------------------------------------------------
> Note: 2 strata omitted because they contain no subpopulation members.
>
> . svylogitgof
>  Number of observations =                         8104197
>  F-adjusted test statistic = F(3,967) =       7865.271
>                       Prob > F =                             0.000
>
>
> Using AUGMENTED subpopulation data:
>
> . svy, subpop(pah): logistic died i.aki2 i.diabetes i.mec_vent i.fem
> Survey: Logistic regression
>
> Number of strata   =        58        Number of obs      =       2565
> Number of PSUs     =      1027        Population size    =  12682.585
>                                       Subpop. no. of obs =       1971
>                                       Subpop. size       =  9686.4649
>                                       Design df          =        969
>                                       F(   4,    966)    =      27.18
>                                       Prob > F           =     0.0000
>
> ------------------------------------------------------------------------------
>              |             Linearized
>         died | Odds Ratio   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>       1.aki2 |   3.511044   .8979891     4.91   0.000     2.125493    5.799799
>   1.diabetes |   .4748568   .1459044    -2.42   0.016     .2598337    .8678202
>   1.mec_vent |   9.576589   2.515918     8.60   0.000     5.718832    16.03668
>     1.female |    1.88229   .5211665     2.28   0.023     1.093231    3.240866
> ------------------------------------------------------------------------------
> Note: 2 strata omitted because they contain no subpopulation members.
>
> . svylogitgof
>  Number of observations =                            2565
>  F-adjusted test statistic  = F(5,965) =          1.096
>                       Prob > F =                           0.361
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
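
A minimal sketch of the workflow suggested in the reply above, for readers who want to try it. It assumes the survey design has already been declared with -svyset-, that pah is a 0/1 subpopulation indicator as in the original post, and that the user-written -svylogitgof- is already installed (for example, located via -findit svylogitgof-). The variable names are taken from the first model in the thread and may need adjusting.

```stata
* Sketch only: assumes -svyset- has been run and pah is coded 0/1.

* Subpopulation-aware fit, as in the original post.
svy, subpop(pah): logistic dead i.aki2 i.diabetes i.mec_vent i.fem

* Refit with an -if- clause so that -svylogitgof-, which is not
* subpopulation-aware, only sees observations in the subpopulation.
svy: logistic dead i.aki2 i.diabetes i.mec_vent i.fem if pah == 1

* Goodness-of-fit test after the -if- fit (user-written command).
svylogitgof
```

Note that standard errors from the -if- fit can differ from those of the subpop() fit, so it may be safest to report the model estimates from the subpop() run and use the -if- run only for the goodness-of-fit test.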


