The Stata listserver

st: RE: Groups (quantiles) in Hosmer-Lemeshow's goodness-of-fit test


From   VISINTAINER PAUL <VISINT@NYMC.EDU>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Groups (quantiles) in Hosmer-Lemeshow's goodness-of-fit test
Date   Thu, 13 Jun 2002 12:21:14 -0400

You can do this easily.  Fit your model, save the predictions, and list
them in sorted order:

    logistic dead crib
    * phat is the predicted probability of "dead"
    predict phat
    * xb is the linear predictor, i.e., the fitted log odds from your
    * regression equation
    predict xb, xb
    * sort from lowest to highest phat, to correspond with the H-L table
    sort phat

    list crib xb phat

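The H-L table divides the dataset into quantile groups of pr(dead), from
lowest to highest (deciles with group(10), or fewer groups when ties merge
some of them).  The -list- command shows each "crib" value next to its
linear predictor, xb, and its predicted probability, phat.  Each _Prob entry
in the H-L table is the largest phat in its group, so finding that value in
the sorted listing gives the highest "crib" score in the group.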
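
If reading the cutpoints off a long listing is tedious, the same
information can be tabulated.  The following is only a sketch, not
necessarily what -lfit- does internally: it assumes that cutting phat into
10 quantile groups with -xtile- reproduces the -lfit- groups, which should
be checked by comparing the group sizes with the _Total column.  The
variable name grp is made up for the illustration.

```stata
* Assumes the data are in memory; refit the model and save the
* predicted probabilities.
logistic dead crib
predict phat

* grp is a hypothetical variable name.  xtile cuts phat into 10 quantile
* groups; tied phat values always fall in the same group, so some group
* numbers may be skipped.
xtile grp = phat, nquantiles(10)

* Lowest and highest crib score and the number of observations per group.
tabstat crib, by(grp) statistics(min max n)

* Largest predicted probability per group, to compare with _Prob.
tabstat phat, by(grp) statistics(max)

* Cross-check for a single cutpoint: the crib value whose fitted
* probability equals a given _Prob (here the first one, 0.0233).
display (logit(0.0233) - _b[_cons]) / _b[crib]
```

With the coefficients shown in the logit output in the original message,
the last line works out to roughly 1, which would put the upper end of the
first group at crib = 1.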

paul



Paul F. Visintainer, PhD
School of Public Health
New York Medical College
Valhalla, NY  10595
(914) 594-4804  (phone)
(914) 594-4292  (fax)


-----Original Message-----
From: Kaaresen Per Ivar [mailto:per.ivar.kaaresen@rito.no]
Sent: Wednesday, June 12, 2002 9:08 AM
To: 'statalist@hsphsun2.harvard.edu'
Subject: st: Groups (quantiles) in Hosmer-Lemeshow's goodness-of-fit test



Not sensitive - ignore message - due to in-house security

Dear friends

Could anybody help me with this (again) probably rather simple problem?

I'm working on the ability of the CRIB (Clinical Risk Index for Babies)
score to predict hospital death at the population (not individual) level.
To test the calibration of the score I've run a logistic regression and then
a Hosmer-Lemeshow goodness-of-fit test. The variables are dead and crib
(range 0-21, integers only).

. logit dead crib

Iteration 0:   log likelihood = -206.16559
Iteration 1:   log likelihood = -155.91351
Iteration 2:   log likelihood = -150.13801
Iteration 3:   log likelihood = -150.00793
Iteration 4:   log likelihood = -150.00765

Logit estimates                                   Number of obs   =        443
                                                  LR chi2(1)      =     112.32
                                                  Prob > chi2     =     0.0000
Log likelihood = -150.00765                       Pseudo R2       =     0.2724

------------------------------------------------------------------------------
        dead |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        crib |   .3516896   .0414307     8.49   0.000     .2704869    .4328923
       _cons |  -4.086216   .3701967   -11.04   0.000    -4.811788   -3.360644
------------------------------------------------------------------------------

. lfit,group(10) table

Logistic model for dead, goodness-of-fit test
(Table collapsed on quantiles of estimated probabilities)

note: because of ties, there are only 9 distinct quantiles

_Group     _Prob     _Obs_1     _Exp_1     _Obs_0     _Exp_0     _Total
     1    0.0233          2        1.8         74       74.2         76
     3    0.0328          2        2.2         65       64.8         67
     4    0.0642          3        3.7         60       59.3         63
     5    0.0888          2        3.2         34       32.8         36
     6    0.1217          4        2.9         20       21.1         24
     7    0.2188         15       13.7         56       57.3         71
     8    0.2848         10        9.7         24       24.3         34
     9    0.4458         12       12.4         20       19.6         32
    10    0.9644         28       28.4         12       11.6         40

       number of observations =       443
             number of groups =         9
      Hosmer-Lemeshow chi2(7) =         1.34
                  Prob > chi2 =         0.9874



My question is (yes, I see the possible problem with small numbers in 5 of
the cells - but that's not the question now): How can I find the intervals
of the CRIB score that the different _Group values (quantiles) represent? I
would like to present the results something like:


  CRIB     _Prob     _Obs_1     _Exp_1     _Obs_0     _Exp_0     _Total
   0-1    0.0233          2        1.8         74       74.2         76
   2-4    0.0328          2        2.2         65       64.8         67
   5-7    0.0642          3        3.7         60       59.3         63
     .    0.0888          2        3.2         34       32.8         36
     .    0.1217          4        2.9         20       21.1         24
     .    0.2188         15       13.7         56       57.3         71
     .    0.2848         10        9.7         24       24.3         34
     .    0.4458         12       12.4         20       19.6         32
     .    0.9644         28       28.4         12       11.6         40


I've seen this kind of presentation in different papers - and would be very
thankful if somebody could explain how I can do this in Stata.


Regards

Per Ivar Kaaresen
MD
University Hospital Northern Norway




*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*



© Copyright 1996–2014 StataCorp LP