# st: A little problem with clustered data

 From Ronán Conroy To "statalist hsphsun2.harvard.edu" Subject st: A little problem with clustered data Date Fri, 30 Apr 2004 12:01:45 +0100

```
I would like to show you the results of two logistic regressions and get
ideas on how to carry the analysis forward.

The situation is this: a survey was run in 38 hospitals. The survey used two
depression scales. Half the hospitals received the first scale, the other
half the second.

Not all patients completed a depression scale. The researcher suspected that
one of the scales was less likely to be returned (it was more threatening
than the other). So he ran a logistic regression, which confirmed his
suspicions.

. logistic depress_scale_ok which_scale

Logistic regression                            Number of obs   =       1206
                                               LR chi2(1)      =      13.09
                                               Prob > chi2     =     0.0003
Log likelihood = -758.99238                    Pseudo R2       =     0.0086

---------------------------------------------------------------------------
depress_sc~k | Odds Ratio   Std. Err.      z    P>|z|  [95% Conf. Interval]
-------------+-------------------------------------------------------------
which_scale |   1.560516   .1927593     3.60   0.000   1.22497    1.987975
---------------------------------------------------------------------------

However, when he used -svyset- to set the PSU to hospital, to account for
patient clustering within hospitals, this is what happened (same point
estimate, but a much wider confidence interval):

. svylogit depress_scale_ok which_scale if which_scale <3, or

Survey logistic regression

pweight:  <none>                               Number of obs    =      1206
Strata:   <one>                                Number of strata =         1
PSU:      hospital_number                      Number of PSUs   =        38
                                               Population size  =      1206
                                               F(   1,     37)  =      3.15
                                               Prob > F         =    0.0839

---------------------------------------------------------------------------
depress_sc~k | Odds Ratio   Std. Err.      t    P>|t|  [95% Conf. Interval]
-------------+-------------------------------------------------------------
which_scale |   1.560516    .390986     1.78   0.084  .9392774    2.592642
---------------------------------------------------------------------------

So it would seem that the variation between hospitals in the rate of return
is greater than the variation you would expect from a binomial process. This
accords with the researcher's experience. Some hospitals took a dislike to
the depression scales, and this was more likely to happen with the more
threatening one.

My question, finally, is what next? Clearly, one source of variability in
the return of completed depression scales is whether the hospital thinks
that it is a useful exercise or not. But are hospitals allocated the second
scale more likely to withhold their collaboration? How much of the poorer
return rate is the unwillingness of patients to fill in the scale, and how
much is the way in which the hospital handles the task of administering the
scale and making sure it is returned?

A good two-pipe problem, as my old metaphysics tutor used to say.

Ronan M Conroy (rconroy@rcsi.ie)
Lecturer in Biostatistics
Royal College of Surgeons
Dublin 2, Ireland
+353 1 402 2431 (fax 2764)

--------------------
Just say no to drug reps
http://www.nofreelunch.org/

```
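
The gap between the two standard errors quoted in the post can be turned into a rough measure of how much extra variation the hospitals contribute. The sketch below is only back-of-envelope arithmetic on the figures printed above. It assumes equal cluster sizes and the usual approximation deff = 1 + (m - 1) * rho, neither of which comes from the original post, and the variable names are purely illustrative.

```python
# Figures copied from the two outputs in the post.
se_naive = 0.1927593    # Std. Err. of the odds ratio from -logistic-
se_cluster = 0.390986   # Std. Err. of the odds ratio from -svylogit-
n_obs = 1206            # patients
n_psu = 38              # hospitals (PSUs)

# Both fits give the same odds ratio, so the ratio of standard errors on the
# odds-ratio scale equals the ratio on the log-odds scale. Its square is the
# variance inflation due to clustering (a design effect).
deff = (se_cluster / se_naive) ** 2

# Average number of patients per hospital.
mean_cluster_size = n_obs / n_psu

# ASSUMPTION: equal cluster sizes, so deff = 1 + (m - 1) * rho.
# Solving for rho gives an implied intraclass correlation.
implied_icc = (deff - 1) / (mean_cluster_size - 1)

print(f"design effect      : {deff:.2f}")
print(f"mean cluster size  : {mean_cluster_size:.1f}")
print(f"implied ICC (rough): {implied_icc:.3f}")
```

On these figures the variance is inflated roughly fourfold, which with about 32 patients per hospital corresponds to an intraclass correlation of about 0.1. This is only an indication: hospital sizes are unlikely to be equal, and the survey estimator uses a linearized variance with 37 degrees of freedom.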