Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: estat gof (Hosmer & Lemeshow) after svy:logistic (survey)


From   Ángel Rodríguez Laso <[email protected]>
To   [email protected]
Subject   st: estat gof (Hosmer & Lemeshow) after svy:logistic (survey)
Date   Wed, 17 Jul 2013 12:23:41 +0200

Dear Statalisters,

I am working with Stata 12.1.


If I fit the following logistic regression in a survey setting
and then type estat gof, I get:


. svy, subpop(if disdesjub==1 & disdestr==1 & trab==1 & dismy50==1 &
proxy==2 & edad_c>=60): logistic discAVD edad_c i.sexo i.estud4
i.difinmes3
(running logistic on estimation sample)

Survey: Logistic regression

Number of strata   =        41                  Number of obs      =      1727
Number of PSUs     =       234                  Population size    = 1347,0862
                                                Subpop. no. of obs =       710
                                                Subpop. size       =    563,75
                                                Design df          =       193
                                                F(   7,    187)    =      8,32
                                                Prob > F           =    0,0000

------------------------------------------------------------------------------
             |             Linearized
     discAVD | Odds Ratio   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      edad_c |       1,10       0,02     4,42   0,000         1,05        1,15
             |
        sexo |
          1  |       1,00  (base)
          2  |       2,60       0,82     3,02   0,003         1,39        4,84
             |
      estud4 |
          0  |       1,00  (base)
          1  |       0,87       0,32    -0,38   0,704         0,43        1,78
          2  |       0,90       0,40    -0,24   0,807         0,37        2,16
          3  |       0,60       0,27    -1,14   0,257         0,24        1,47
             |
   difinmes3 |
          0  |       1,00  (base)
          1  |       1,59       0,57     1,31   0,190         0,79        3,21
          2  |       3,33       1,20     3,35   0,001         1,64        6,77
             |
       _cons |       0,00       0,00    -5,88   0,000         0,00        0,00
------------------------------------------------------------------------------

.
end of do-file

. estat gof
estat gof is not allowed after subpopulation estimations
r(198);



Then I replace the subpop() specification with an if qualifier:


. svy: logistic discAVD edad_c i.sexo i.estud4 i.difinmes3 if
disdesjub==1 & disdestr==1 & trab==1 & dismy50==1 & proxy==2 &
edad_c>=60
(running logistic on estimation sample)

Survey: Logistic regression

Number of strata   =        41                  Number of obs      =       710
Number of PSUs     =       193                  Population size    =    563,75
                                                Design df          =       152
                                                F(   7,    146)    =      8,35
                                                Prob > F           =    0,0000

------------------------------------------------------------------------------
             |             Linearized
     discAVD | Odds Ratio   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      edad_c |       1,10       0,02     4,41   0,000         1,05        1,15
             |
        sexo |
          1  |       1,00  (base)
          2  |       2,60       0,82     3,02   0,003         1,39        4,85
             |
      estud4 |
          0  |       1,00  (base)
          1  |       0,87       0,32    -0,38   0,707         0,42        1,79
          2  |       0,90       0,40    -0,25   0,807         0,37        2,16
          3  |       0,60       0,27    -1,15   0,254         0,24        1,46
             |
   difinmes3 |
          0  |       1,00  (base)
          1  |       1,59       0,56     1,32   0,189         0,79        3,21
          2  |       3,33       1,18     3,39   0,001         1,65        6,72
             |
       _cons |       0,00       0,00    -5,88   0,000         0,00        0,00
------------------------------------------------------------------------------

. estat gof

Logistic model for discAVD, goodness-of-fit test

                     F(9,144) =       110,29
                     Prob > F =         0,0000



But if I drop the survey specification altogether, I get:

. logistic discAVD edad_c i.sexo i.estud4 i.difinmes3 if disdesjub==1
& disdestr==1 & trab==1 & dismy50==1 & proxy==2 & edad_c>=60

Logistic regression                               Number of obs   =        710
                                                  LR chi2(7)      =      65,87
                                                  Prob > chi2     =     0,0000
Log likelihood = -210,78135                       Pseudo R2       =     0,1351

------------------------------------------------------------------------------
     discAVD | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      edad_c |       1,10       0,02     5,28   0,000         1,06        1,14
             |
        sexo |
          1  |       1,00  (base)
          2  |       1,96       0,56     2,36   0,018         1,12        3,44
             |
      estud4 |
          0  |       1,00  (base)
          1  |       0,87       0,29    -0,42   0,676         0,45        1,69
          2  |       0,88       0,40    -0,28   0,781         0,36        2,14
          3  |       0,52       0,25    -1,37   0,170         0,21        1,32
             |
   difinmes3 |
          0  |       1,00  (base)
          1  |       1,89       0,61     1,97   0,049         1,00        3,57
          2  |       3,84       1,39     3,70   0,000         1,88        7,83
             |
       _cons |       0,00       0,00    -7,01   0,000         0,00        0,00
------------------------------------------------------------------------------

. estat gof

Logistic model for discAVD, goodness-of-fit test

       number of observations =       710
 number of covariate patterns =       350
            Pearson chi2(342) =       328,89
                  Prob > chi2 =         0,6852


The last two models don't look terribly different, so what is the
reason for such a large change in the Hosmer & Lemeshow result? Which
one should I trust?
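
For a closer like-for-like comparison, note that the unweighted output
above is the Pearson test over 350 covariate patterns, because estat gof
was typed without group(), while the survey output is an F statistic.
A minimal sketch of requesting the grouped form in both cases follows.
It assumes the variable names used above; the local macro cond is only
a hypothetical shorthand for the common if condition, and ten groups is
an assumed choice, not something fixed by the data.

```stata
* Sketch only: run these lines together (for example in one do-file),
* because a local macro does not survive between separate do-file runs.
local cond disdesjub==1 & disdestr==1 & trab==1 & dismy50==1 & proxy==2 & edad_c>=60

* Unweighted fit, then the decile-grouped (Hosmer-Lemeshow) form of the test.
* The table option lists observed and expected counts per group.
logistic discAVD edad_c i.sexo i.estud4 i.difinmes3 if `cond'
estat gof, group(10) table

* Design-based fit on the same cases; here estat gof reports an F statistic.
svy: logistic discAVD edad_c i.sexo i.estud4 i.difinmes3 if `cond'
estat gof, group(10)
```

Both calls then group on deciles of the fitted probabilities, so any
remaining difference would come from the weights and the design-based
variance rather than from the grouping rule.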

Thank you for your time and attention.

Angel Rodriguez-Laso