
st: RE: RE: Re: Analysis of binary cluster data with missing values

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: RE: Re: Analysis of binary cluster data with missing values
Date	Thu, 13 Nov 2003 13:30:31 -0000

There may be good design reasons for
this data structure, but to me the information
here makes the dataset seem
even more refractory. I assumed incorrectly
that a missing value in one observation still
meant that one limb could be used in
analysis.

Note also that even though you know that
hip and knee are not paired, that is how
your data structure appears to Stata.
In other words, let's take another person,
with one hip and one knee replaced. That
could be represented, if I understand
correctly, as

5   1  0
5   0  1

or as

5   1   1
5   0   0

so that those different representations of
the data would end up summarised in different
cells of a 2 X 2 table, leading to quite
different odds ratios. As far as Stata is
concerned, these are different data.
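
To see this concretely, here is a minimal sketch that types
in the hypothetical person 5 both ways with -input- and
tabulates each version. Nothing here comes from the real
dataset; it only shows which cells of the 2 X 2 table the
two layouts fill.

```stata
* Layout A: the replaced hip and the replaced knee sit on different rows
clear
input id hip knee
5 1 0
5 0 1
end
tabulate hip knee

* Layout B: the replaced hip and the replaced knee sit on the same row
clear
input id hip knee
5 1 1
5 0 0
end
tabulate hip knee
```

Layout A puts both observations in the off-diagonal cells
(hip and knee disagree), while layout B puts both on the
diagonal (hip and knee agree), although the person is the same.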

Finally, doesn't the whole idea of a cluster
depend on being able to think of individuals
with their own identity, as say with
individual people within a family? What's
the identity of one arbitrary hip and one
arbitrary knee? Not that I have any insights
here beyond using a pair of legs.

Nick
[email protected]

Louise Linsell

> To fill you in further, the problem is this:
>
> I have a sample of 1000 people, that is, 2000 hips and 2000 knees.
> A proportion of each are replaced: 400 replaced hips and 600
> replaced knees. I wish to test for the difference between the
> proportion of replaced hips and replaced knees, or calculate an odds
> ratio with a confidence interval, using a method that deals with the
> lack of independence between joints. Each row of the dataset
> represents a hip and a knee for each person, but they are not paired
> in any order, i.e. within person, the right hip could appear on the
> same line as the left knee or the right knee, and this is why it
> matters that Stata drops the whole line of observations when it
> drops a missing value in the analysis.
>
> >>> [email protected] 13/11/2003 11:39:07 >>>
> I assume "row" means "leg" or "limb".
>
> I don't think this is a Stata issue,
> as Stata's behaviour appears correct
> given the data and what you ask of it.
>
> Please explain why you think your
> by-hand analysis is correct. I
> don't think you can collapse the
> table in this way. A test of
> that is that if you reverse the
> table you cannot recover even a subset
> of the data correctly, as you do not have
> complete data on 6 limbs. Rather, you have
> complete data on 5 limbs only,
> as Stata is telling you.
>
> Nick
> [email protected]
>
> Louise Linsell
>
> > I have a dataset that looks something like this: each person has
> > observations on 2 hips and 2 knees, and I wish to calculate the
> > odds ratio of hips to knees, taking account of the clustering
> > within person using robust standard errors.
> >
> > id    hip    knee    row
> > 1     1      0       1
> > 1     1      0       2      (1=replaced, 0=not replaced)
> > 2     .      1       3
> > 2     0      1       4
> > 3     1      0       5
> > 3     0      .       6
> > 4     1      0       7
> > 4     .      .       8
> >
> > I have tried all of the following commands
> >
> > logistic hip knee, cluster(id)
> > xtlogit hip knee, i(id) or
> > svymean hip, by(knee)
> >
> > but all of them drop any record (line) with a missing value (.), so
> > in the above dataset, rows 3, 6 and 8 would be dropped, despite
> > there being a non-missing observation in the other group that
> > should be included in the analysis. This gives estimates of the
> > odds ratio that disagree with those calculated by hand from the
> > equivalent 2x2 table. E.g. from the above example, if I was
> > calculating the OR by hand I would get
> >
> >                  yes (1)     no (0)
> >
> > hips            4              2
> >
> > knees         2              4
> >
> > OR = (4X4)/(2X2) = 4
> >
> > However, if all the rows with missing values are dropped, you get
> >
> >                 yes (1)     no (0)
> >
> > hips            4              1
> >
> > knees         1              4
> >
> > OR = (4X4)/(1X1) = 16
> >
> > Any ideas or suggestions on how to calculate the correct odds ratio
> > and robust s.e. would be much appreciated.
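
One way to reproduce the by-hand table in Stata, offered only as a
sketch: treat each joint as its own observation by stacking hip and
knee into a single outcome variable. The names person and replaced
are made up here for illustration; _stack is the indicator that
-stack- creates (1 for the hip values, 2 for the knee values).
Whether clustering on person is then defensible for unpaired joints
is the separate question raised above.

```stata
* The example data from the quoted message
clear
input id hip knee row
1 1 0 1
1 1 0 2
2 . 1 3
2 0 1 4
3 1 0 5
3 0 . 6
4 1 0 7
4 . . 8
end

* Stack (id, hip) on top of (id, knee): one observation per joint
stack id hip id knee, into(person replaced) clear

* A missing joint now drops out alone, not with its row neighbour
tabulate _stack replaced

* Odds ratio for knees relative to hips, clustering on person
logistic replaced _stack, cluster(person)
```

With these eight rows the table shows 4 replaced and 2 not replaced
for hips, and 2 replaced and 4 not replaced for knees, matching the
first by-hand table. The reported odds ratio is for knees relative
to hips, so it is the reciprocal of the by-hand value of 4.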
