
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: Re: Analysis of binary cluster data with missing values
Date: Thu, 13 Nov 2003 13:30:31 -0000

There may be good design reasons for this data structure, but to me the information here makes the dataset seem even more refractory. I assumed incorrectly that a missing value in one observation still meant that one limb could be used in analysis.

Note also that even though you know that hip and knee are not paired, that is how your data structure appears to Stata. In other words, let's take another person, with one hip and one knee replaced. That could be represented, if I understand correctly, as

5 1 0
5 0 1

or as

5 1 1
5 0 0

so that those different representations of the data would end up summarised in different cells of a 2 X 2 table, leading to quite different odds ratios. As far as Stata is concerned, these are different data.

Finally, doesn't the whole idea of a cluster depend on being able to think of individuals with their own identity, as say with individual people within a family? What's the identity of one arbitrary hip and one arbitrary knee?

Not that I have any insights here beyond using a pair of legs.

Nick
n.j.cox@durham.ac.uk

Louise Linsell

> To fill you in further, the problem is this:
>
> I have a sample of 1000 people, that is, 2000 hips and 2000 knees. A
> proportion of each are replaced, 400 replaced hips and 600 replaced
> knees.
> I wish to test for the difference between the proportion of replaced
> hips and replaced knees, or calculate an odds ratio with a confidence
> interval, using a method that deals with the lack of independence
> between joints. Each row of the dataset represents a hip and a knee for
> each person, but they are not paired in any order, i.e. within person,
> the right hip could appear on the same line as the left knee or the
> right knee, and this is why it matters that STATA drops the whole line
> of observations when it drops a missing value in the analysis.
>
> >>> n.j.cox@durham.ac.uk 13/11/2003 11:39:07 >>>
> I assume "row" means "leg" or "limb".
>
> I don't think this is a Stata issue,
> as Stata's behaviour appears correct
> given the data and what you ask of it.
>
> Please explain why you think your
> by-hand analysis is correct. I
> don't think you can collapse the
> table in this way. A test of
> that is that if you reverse the
> table you cannot recover even a subset
> of the data correctly, as you do not have
> complete data on 6 limbs. Rather, you have
> complete data on 5 limbs only,
> as Stata is telling you.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Louise Linsell
>
> > I have a dataset that looks something like this - each person has
> > observations on 2 hips and 2 knees and I wish to calculate the odds ratio
> > of hips to knees, taking account of the clustering within person using
> > robust standard errors.
> >
> > id   hip   knee   row
> > 1    1     0      1
> > 1    1     0      2     (1=replaced, 0=not replaced)
> > 2    .     1      3
> > 2    0     1      4
> > 3    1     0      5
> > 3    0     .      6
> > 4    1     0      7
> > 4    .     .      8
> >
> > I have tried all of the following commands
> >
> > logistic hip knee, cluster(id)
> > xtlogit hip knee, i(id) or
> > svymean hip, by(knee)
> >
> > but all of them drop any record (line) with a missing value (.), so in
> > the above dataset, rows 3, 6 and 8 would be dropped, despite there being
> > a non-missing observation in the other group that should be included in
> > the analysis. This gives incorrect estimates for the odds ratio when you
> > calculate them by hand from the equivalent 2x2 table. E.g. from the
> > above example, if I was calculating the OR by hand I would get
> >
> >          yes (1)   no (0)
> >
> > hips     4         2
> >
> > knees    2         4
> >
> > OR = (4X4)/(2X2) = 4
> >
> > However if all the rows with the missing values are dropped, you get
> >
> >          yes (1)   no (0)
> >
> > hips     4         1
> >
> > knees    1         4
> >
> > OR = (4X4)/(1X1) = 16
> >
> > Any ideas or suggestions on how to calculate the correct odds ratio and
> > robust s.e. would be much appreciated.
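
The arithmetic discussed in this thread can be checked with a short script. What follows is a minimal Python sketch, not Stata and not anything either poster ran. It assumes the 8-row example exactly as posted, with None standing in for Stata's missing value (.), and all function names (marginal_table, odds_ratio, complete_rows, cross_tab) are made up for illustration. It reproduces the two hand-computed odds ratios (4 using every non-missing joint, 16 after dropping incomplete rows) and then shows that the two representations of person 5 have identical hip and knee totals yet fall in different cells when hip is cross-tabulated against knee. It says nothing about which analysis is appropriate or about standard errors.

```python
from collections import Counter

# Example data as posted: (id, hip, knee); None stands for Stata's missing (.)
ROWS = [
    (1, 1, 0), (1, 1, 0),
    (2, None, 1), (2, 0, 1),
    (3, 1, 0), (3, 0, None),
    (4, 1, 0), (4, None, None),
]


def marginal_table(rows):
    """Count (replaced, not replaced) separately for hips and for knees,
    skipping only the individual missing values."""
    hips = Counter(h for _, h, _ in rows if h is not None)
    knees = Counter(k for _, _, k in rows if k is not None)
    return {"hips": (hips[1], hips[0]), "knees": (knees[1], knees[0])}


def odds_ratio(table):
    """Odds of replacement for hips divided by odds of replacement for knees."""
    (a, b), (c, d) = table["hips"], table["knees"]
    return (a * d) / (b * c)


def complete_rows(rows):
    """Keep only rows with both hip and knee observed (listwise deletion)."""
    return [r for r in rows if r[1] is not None and r[2] is not None]


def cross_tab(rows):
    """Cell counts of the hip-by-knee table, treating each row as a pair."""
    return Counter((h, k) for _, h, k in rows)


by_hand = marginal_table(ROWS)                  # every non-missing joint
listwise = marginal_table(complete_rows(ROWS))  # rows 3, 6 and 8 dropped
print(by_hand, odds_ratio(by_hand))
print(listwise, odds_ratio(listwise))

# Two ways of writing down person 5 (one hip and one knee replaced)
rep_a = [(5, 1, 0), (5, 0, 1)]
rep_b = [(5, 1, 1), (5, 0, 0)]
print(marginal_table(rep_a) == marginal_table(rep_b))  # same totals
print(cross_tab(rep_a), cross_tab(rep_b))              # different cells
```

Any command that treats hip and knee on the same row as a pair works from the cross-tabulated cells, which is why the arbitrary ordering of joints within a person matters there even though it leaves the separate hip and knee totals unchanged.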
