Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: estat gof (Hosmer & Lemeshow) after svy:logistic (survey)


From   Steve Samuels <[email protected]>
To   [email protected]
Subject   Re: st: estat gof (Hosmer & Lemeshow) after svy:logistic (survey)
Date   Wed, 17 Jul 2013 18:05:24 -0500

See: http://www.stata.com/statalist/archive/2011-03/msg00550.html

Steve
[email protected]

On Jul 17, 2013, at 5:23 AM, Ángel Rodríguez Laso wrote:

Dear Statalisters,

Working with Stata 12.1.


If I carry out the following logistic regression in a survey setting
and then type estat gof I get:


. svy, subpop(if disdesjub==1 & disdestr==1 & trab==1 & dismy50==1 &
proxy==2 & edad_c>=60): logistic discAVD edad_c i.sexo i.estud4
i.difinmes3
(running logistic on estimation sample)

Survey: Logistic regression

Number of strata   =        41                  Number of obs      =      1727
Number of PSUs     =       234                  Population size    = 1347,0862
                                               Subpop. no. of obs =       710
                                               Subpop. size       =    563,75
                                               Design df          =       193
                                               F(   7,    187)    =      8,32
                                               Prob > F           =    0,0000

------------------------------------------------------------------------------
            |             Linearized
    discAVD | Odds Ratio   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     edad_c |       1,10       0,02     4,42   0,000         1,05        1,15
            |
       sexo |
         1  |       1,00  (base)
         2  |       2,60       0,82     3,02   0,003         1,39        4,84
            |
     estud4 |
         0  |       1,00  (base)
         1  |       0,87       0,32    -0,38   0,704         0,43        1,78
         2  |       0,90       0,40    -0,24   0,807         0,37        2,16
         3  |       0,60       0,27    -1,14   0,257         0,24        1,47
            |
  difinmes3 |
         0  |       1,00  (base)
         1  |       1,59       0,57     1,31   0,190         0,79        3,21
         2  |       3,33       1,20     3,35   0,001         1,64        6,77
            |
      _cons |       0,00       0,00    -5,88   0,000         0,00        0,00
------------------------------------------------------------------------------

.
end of do-file

. estat gof
estat gof is not allowed after subpopulation estimations
r(198);



Then I replace the subpop() specification with an if qualifier:


. svy: logistic discAVD edad_c i.sexo i.estud4 i.difinmes3 if
disdesjub==1 & disdestr==1 & trab==1 & dismy50==1 & proxy==2 &
edad_c>=60
(running logistic on estimation sample)

Survey: Logistic regression

Number of strata   =        41                  Number of obs      =       710
Number of PSUs     =       193                  Population size    =    563,75
                                               Design df          =       152
                                               F(   7,    146)    =      8,35
                                               Prob > F           =    0,0000

------------------------------------------------------------------------------
            |             Linearized
    discAVD | Odds Ratio   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     edad_c |       1,10       0,02     4,41   0,000         1,05        1,15
            |
       sexo |
         1  |       1,00  (base)
         2  |       2,60       0,82     3,02   0,003         1,39        4,85
            |
     estud4 |
         0  |       1,00  (base)
         1  |       0,87       0,32    -0,38   0,707         0,42        1,79
         2  |       0,90       0,40    -0,25   0,807         0,37        2,16
         3  |       0,60       0,27    -1,15   0,254         0,24        1,46
            |
  difinmes3 |
         0  |       1,00  (base)
         1  |       1,59       0,56     1,32   0,189         0,79        3,21
         2  |       3,33       1,18     3,39   0,001         1,65        6,72
            |
      _cons |       0,00       0,00    -5,88   0,000         0,00        0,00
------------------------------------------------------------------------------

. estat gof

Logistic model for discAVD, goodness-of-fit test

                    F(9,144) =       110,29
                    Prob > F =         0,0000



But if I drop the survey specification, I get:

. logistic discAVD edad_c i.sexo i.estud4 i.difinmes3 if disdesjub==1
& disdestr==1 & trab==1 & dismy50==1 & proxy==2 & edad_c>=60

Logistic regression                               Number of obs   =        710
                                                 LR chi2(7)      =      65,87
                                                 Prob > chi2     =     0,0000
Log likelihood = -210,78135                       Pseudo R2       =     0,1351

------------------------------------------------------------------------------
    discAVD | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     edad_c |       1,10       0,02     5,28   0,000         1,06        1,14
            |
       sexo |
         1  |       1,00  (base)
         2  |       1,96       0,56     2,36   0,018         1,12        3,44
            |
     estud4 |
         0  |       1,00  (base)
         1  |       0,87       0,29    -0,42   0,676         0,45        1,69
         2  |       0,88       0,40    -0,28   0,781         0,36        2,14
         3  |       0,52       0,25    -1,37   0,170         0,21        1,32
            |
  difinmes3 |
         0  |       1,00  (base)
         1  |       1,89       0,61     1,97   0,049         1,00        3,57
         2  |       3,84       1,39     3,70   0,000         1,88        7,83
            |
      _cons |       0,00       0,00    -7,01   0,000         0,00        0,00
------------------------------------------------------------------------------

. estat gof

Logistic model for discAVD, goodness-of-fit test

      number of observations =       710
number of covariate patterns =       350
           Pearson chi2(342) =       328,89
                 Prob > chi2 =         0,6852


The last two models don't look terribly different, so what is the
reason for such a large change in the Hosmer & Lemeshow result? Which
one should I trust?
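
One point that may matter for the comparison: the unweighted output above is headed "Pearson chi2" over covariate patterns, which is what estat gof reports after logistic when no group() option is given, whereas after svy: logistic the command reports an F-adjusted mean residual test on grouped data (10 groups unless group() says otherwise). A minimal sketch of asking for grouped tests in both settings follows. It uses Stata's shipped example datasets (lbw and nhanes2d) as stand-ins, which is an assumption; the models below are not the ones in this post.

```stata
* Sketch only: example data shipped with Stata, not the survey file in this post.
webuse lbw, clear
logistic low age lwt i.race smoke ptl ht ui

* Default: Pearson chi2 computed over covariate patterns
estat gof

* Hosmer-Lemeshow chi2 on 10 quantile groups of the fitted probabilities
estat gof, group(10)

* Survey setting: nhanes2d is already svyset
webuse nhanes2d, clear
svy: logistic highbp height weight age female

* Design-based F-adjusted mean residual test on 10 groups
estat gof, group(10)
```

Running the unweighted model with group(10) gives a grouped statistic that is closer in construction to the survey result than the default Pearson output, although the two still rest on different assumptions about the sampling design.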

Thank you for your time and attention.

Angel Rodriguez-Laso
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2018 StataCorp LLC