# st: RE: Groups (quantiles) in Hosmer-Lemeshows goodness of fit test

From: VISINTAINER PAUL
To: "'statalist@hsphsun2.harvard.edu'"
Subject: st: RE: Groups (quantiles) in Hosmer-Lemeshows goodness of fit test
Date: Thu, 13 Jun 2002 12:21:14 -0400

```
You can do this easily.  Run your model, then:

. predict phat      ----> this gives you the predicted probability of "dead"
. predict xb, xb    ----> this is the linear predictor, i.e., the fitted log odds
. sort phat         ----> this sorts the dataset from lowest to highest phat
                          values, to correspond with the H-L table

. list crib xb phat

The H-L table divides the dataset into quantile groups of pr(dead), from lowest
to highest.  The -list- command will associate each "crib" value with its
linear predictor, xb, and its predicted probability, phat.  The H-L cutpoints
(the _Prob column) will correspond to values in the phat listing.

paul

Paul F. Visintainer, PhD
School of Public Health
New York Medical College
Valhalla, NY  10595
(914) 594-4804  (phone)
(914) 594-4292  (fax)

-----Original Message-----
From: Kaaresen Per Ivar [mailto:per.ivar.kaaresen@rito.no]
Sent: Wednesday, June 12, 2002 9:08 AM
To: 'statalist@hsphsun2.harvard.edu'
Subject: st: Groups (qantiles) in Hosmer-Lemeshows goodness of fit test

Not sensitive - ignore message - due to in-house security

Dear friends

Could anybody help me with this (again) probably rather simple problem?

I'm working on the ability of the CRIB (clinical risk index for babies) score
to predict hospital death at a population (not individual) level.  To test the
calibration of the score I've done a logistic regression and then a
Hosmer-Lemeshow goodness of fit test. The variables are dead and crib (range
0-21, integers only).

. logit dead crib

Iteration 0:   log likelihood = -206.16559
Iteration 1:   log likelihood = -155.91351
Iteration 2:   log likelihood = -150.13801
Iteration 3:   log likelihood = -150.00793
Iteration 4:   log likelihood = -150.00765

Logit estimates                                   Number of obs   =        443
                                                  LR chi2(1)      =     112.32
                                                  Prob > chi2     =     0.0000
Log likelihood = -150.00765                       Pseudo R2       =     0.2724

------------------------------------------------------------------------------
        dead |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        crib |   .3516896   .0414307     8.49   0.000     .2704869    .4328923
       _cons |  -4.086216   .3701967   -11.04   0.000    -4.811788   -3.360644
------------------------------------------------------------------------------

. lfit,group(10) table

Logistic model for dead, goodness-of-fit test
(Table collapsed on quantiles of estimated probabilities)

note: because of ties, there are only 9 distinct quantiles

  _Group    _Prob   _Obs_1   _Exp_1   _Obs_0   _Exp_0   _Total
       1   0.0233        2      1.8       74     74.2       76
       3   0.0328        2      2.2       65     64.8       67
       4   0.0642        3      3.7       60     59.3       63
       5   0.0888        2      3.2       34     32.8       36
       6   0.1217        4      2.9       20     21.1       24
       7   0.2188       15     13.7       56     57.3       71
       8   0.2848       10      9.7       24     24.3       34
       9   0.4458       12     12.4       20     19.6       32
      10   0.9644       28     28.4       12     11.6       40

number of observations =       443
number of groups =         9
Hosmer-Lemeshow chi2(7) =         1.34
Prob > chi2 =         0.9874

My question is (yes, I see the possible problem with small numbers in 5 of
the cells - but that's not the question now): How can I find the intervals
of CRIB score which the different _Group values (quantiles) represent? I would
like to present the results something like:

    CRIB    _Prob   _Obs_1   _Exp_1   _Obs_0   _Exp_0   _Total
     0-1   0.0233        2      1.8       74     74.2       76
     2-4   0.0328        2      2.2       65     64.8       67
     5-7   0.0642        3      3.7       60     59.3       63
       .   0.0888        2      3.2       34     32.8       36
       .   0.1217        4      2.9       20     21.1       24
       .   0.2188       15     13.7       56     57.3       71
       .   0.2848       10      9.7       24     24.3       34
       .   0.4458       12     12.4       20     19.6       32
       .   0.9644       28     28.4       12     11.6       40

I've seen this kind of presentation in different papers - and would be very
thankful if somebody could explain how I can do this in Stata.

Regards

Per Ivar Kaaresen
MD
University Hospital Northern Norway

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
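The steps in the reply (predict, sort, list) can be pushed a little further so that Stata prints the CRIB range of each group directly. What follows is a minimal sketch, not the method used by -lfit- itself: it assumes the poster's dataset (variables dead and crib) is in memory, and the names hlgroup, crib_lo, crib_hi, maxprob, obs_1, exp_1 and total are made up for illustration. It also assumes that -xtile- on phat forms the same groups as -lfit, group(10)-, which should be checked against the _Total column of the H-L table.

```stata
* Assumption: the dataset with dead (0/1) and crib (0-21) is already in memory.
logit dead crib
predict phat                      // predicted Pr(dead = 1)

* Cut phat into (up to) 10 quantile groups. Tied phat values stay in the
* same group, so fewer than 10 distinct groups can appear.
xtile hlgroup = phat, nquantiles(10)

* Collapse to one row per group: lowest and highest CRIB score, largest
* predicted probability, observed and expected deaths, and group size.
* -collapse- replaces the data in memory, hence preserve/restore.
preserve
collapse (min) crib_lo = crib (max) crib_hi = crib maxprob = phat ///
         (sum) obs_1 = dead exp_1 = phat (count) total = dead,   ///
         by(hlgroup)
list, noobs clean
restore
```

Because crib is the only covariate and its coefficient is positive, phat rises with crib, so each group should cover a contiguous run of CRIB scores. As a hand check using the coefficients shown in the post, invlogit(-4.086216 + .3516896*1) is about 0.0233, which matches the _Prob of group 1, so that group would end at CRIB = 1.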