st: RE: Re: Analysis of binary cluster data with missing values

 From "Louise Linsell" To , Subject st: RE: Re: Analysis of binary cluster data with missing values Date Thu, 13 Nov 2003 12:18:33 +0000

To fill you in further, the problem is this:

I have a sample of 1000 people, that is, 2000 hips and 2000 knees.  A
proportion of each has been replaced: 400 hips and 600 knees.
I wish to test for a difference between the proportion of replaced
hips and the proportion of replaced knees, or calculate an odds ratio
with a confidence interval, using a method that deals with the lack of
independence between joints.  Each row of the dataset holds one hip and
one knee from the same person, but the two are not paired in any
particular order, i.e. within person, the right hip could appear on the
same line as either the left knee or the right knee.  This is why it
matters that Stata drops the whole line of observations when either
value on that line is missing.

Louise

>>> n.j.cox@durham.ac.uk 13/11/2003 11:39:07 >>>
I assume "row" means "leg" or "limb".

I don't think this is a Stata issue,
as Stata's behaviour appears correct
given the data and what you ask of it.

Please explain why you think your
by-hand analysis is correct. I
don't think you can collapse the
table in this way. A test of
that is that if you reverse the
table you cannot recover even a subset
of the data correctly, as you do not have
complete data on 6 limbs. Rather, you have
complete data on 5 limbs only,
as Stata is telling you.

Nick
n.j.cox@durham.ac.uk

Louise Linsell

> I have a dataset that looks something like this - each person has
> observations on 2 hips and 2 knees and I wish to calculate the odds
> ratio of hips to knees, taking account of the clustering within
> person using robust standard errors.
>
> id    hip    knee    row
> 1     1      0       1
> 1     1      0       2      (1=replaced, 0=not replaced)
> 2     .      1       3
> 2     0      1       4
> 3     1      0       5
> 3     0      .       6
> 4     1      0       7
> 4     .      .       8
>
> I have tried all of the following commands
>
> logistic hip knee, cluster(id)
> xtlogit hip knee, i(id) or
> svymean hip, by(knee)
>
> but all of them drop any record (line) with a missing value (.), so
> in the above dataset, rows 3, 6 and 8 would be dropped, despite there
> being a non-missing observation in the other group (rows 3 and 6)
> that should be included in the analysis. This gives odds ratio
> estimates that differ from the ones you calculate by hand from the
> equivalent 2x2 table, e.g. from the above example, if I was
> calculating the OR by hand I would get
>
>                  yes (1)     no (0)
>
> hips            4              2
>
> knees         2              4
>
> OR = (4X4)/(2X2) = 4
>
> However if all the rows with the missing values are dropped, you get
>
>                 yes (1)     no (0)
>
> hips            4              1
>
> knees         1              4
>
> OR = (4X4)/(1X1) = 16
>
> Any ideas or suggestions on how to calculate the correct
> odds ratio and
> robust s.e. would be much appreciated.
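
One possible way round the whole-line deletion described above, offered
as a sketch rather than a tested solution, is to stack the data so that
each joint is its own observation. The sketch assumes the variables id,
hip, knee and row from the example, coded 1 = replaced and 0 = not
replaced. The names replaced, joint and isknee are hypothetical and are
introduced only for illustration.

```stata
* Sketch only: assumes id, hip, knee and row exist as in the example,
* with row uniquely identifying each line of the dataset.
* replaced, joint and isknee are hypothetical names.
rename hip  replaced1
rename knee replaced2

* One observation per joint instead of one per hip-knee line
* (joint = 1 for the hip, joint = 2 for the knee; id is carried along)
reshape long replaced, i(row) j(joint)

* Indicator for knee, so that hip is the reference category
generate byte isknee = (joint == 2)

* Only joints with a missing outcome are excluded now, not whole lines
tabulate joint replaced

* Odds ratio of replacement, knee versus hip, with standard errors
* that allow for clustering within person
logistic replaced isknee, cluster(id)
```

With the eight-line example, the tabulate step should reproduce the
first by-hand table (hips 4 and 2, knees 2 and 4). The odds ratio
reported by logistic is for knees relative to hips, so it is the
reciprocal of the hips-to-knees odds ratio calculated by hand.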

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*