The Stata listserver

Re: st: Re: Analysis of binary cluster data with missing values

From   Joseph Coveney
To   Statalist
Subject   Re: st: Re: Analysis of binary cluster data with missing values
Date   Fri, 14 Nov 2003 01:22:33 +0900

Louise Linsell wrote (excerpted):


I have a dataset that looks something like this - each person has
observations on 2 hips and 2 knees and I wish to calculate the odds ratio
of hips to knees, taking account of the clustering within person using
robust standard errors.

id    hip    knee    row
1     1      0       1
1     1      0       2      (1=replaced, 2=not replaced)
2     .      1       3
2     0      1       4
3     1      0       5
3     0      .       6
4     1      0       7
4     .      .       8

I have tried all of the following commands 

logistic hip knee, cluster(id)
xtlogit hip knee, i(id) or
svymean hip, by(knee)

but all of them drop any record (line) with a missing value (.), so in
the above dataset, rows 3, 6 and 8 would be dropped, despite there being
a non-missing observation in the other group that should be included in
the analysis. . . .

Any ideas or suggestions on how to calculate the correct odds ratio and
robust s.e. would be much appreciated.


Louise mentions in a later posting that the dataset does not pair joints by
left or right side within a record, so the joints are clustered only within
patient and can be treated simply as Joint A (hips) and Joint B (knees) for
each patient.  To accommodate this, Louise might wish to consider reshaping
long, calling the hip "joint 0" and the knee "joint 1" (or vice versa), and
then fitting a random-effects logistic regression to obtain the odds ratio for
replacement (the odds of replacement of Joint 1 relative to those of Joint 0).
In long form each joint is its own record, so a missing hip no longer causes
the knee recorded on the same row to be dropped, and the method makes full use
of the available data.  The do-file below illustrates this with an artificial
dataset designed to mimic Louise's.  Much of the do-file just creates the
fictitious dataset for illustration.
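
For readers who prefer symbols, the suggested model can be sketched as below.  The notation is an assumption added for illustration and does not appear in the original posting: y_ij is the replacement indicator for joint record j of patient i, joint_ij is 0 for a hip and 1 for a knee, and u_i is a patient-level random intercept, which xtlogit, re takes to be normally distributed.

```latex
\operatorname{logit}\,\Pr(y_{ij} = 1 \mid u_i)
  = \beta_0 + \beta_1\,\mathrm{joint}_{ij} + u_i,
\qquad u_i \sim N(0, \sigma_u^2),
\qquad \mathrm{OR} = \exp(\beta_1)
```

Under this sketch the reported odds ratio is conditional on the patient (subject-specific), comparing the odds of replacement for Joint 1 with those for Joint 0 within the same patient.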

Joseph Coveney


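clear
set obs 2000
set seed 20031113
// Patient ID: two records per patient (pid = 1, 1, 2, 2, ...)
generate int pid = mod(_n, 2)
replace pid = sum(pid)
// Shuffle the records, then mark the hip as replaced in the first 400
generate float order = uniform()
sort order                         // No particular correlation
generate byte hip = _n <= 400
// Reshuffle, then mark the knee as replaced in the first 600
replace order = uniform()
sort order
drop order
generate knee = _n <= 600
replace hip = . if uniform() > 0.95 // 5% MCAR for both joints
replace knee = . if uniform() > 0.95
// Louise can begin here
sort pid
by pid: generate byte side = _n    // record number within patient
rename hip replaced0
rename knee replaced1
// One record per joint: joint = 0 (hip) or 1 (knee)
reshape long replaced, i(pid side) j(joint)
drop side
// Random-effects logistic regression, reported as odds ratios
xtlogit replaced joint, i(pid) re or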
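
As a rough illustration of why the long form helps, the sketch below applies the same reshape to the eight example rows quoted above.  The variable names id and side follow the do-file's pattern but are otherwise assumptions, and eight rows are far too few for a meaningful fit, so the final command is shown only for its syntax.  It is one possible way, not necessarily the only one, to get the cluster-robust standard errors that were asked for.

```stata
clear
input id hip knee
1 1 0
1 1 0
2 . 1
2 0 1
3 1 0
3 0 .
4 1 0
4 . .
end
bysort id: generate byte side = _n   // record number within person
rename hip replaced0
rename knee replaced1
reshape long replaced, i(id side) j(joint)
drop side
// 16 joint records in all, 4 of them missing; the model commands skip
// only those 4 records rather than the whole original rows 3, 6 and 8
count if !missing(replaced)
list, sepby(id)
// Ordinary logistic regression with standard errors clustered on person
logistic replaced joint, cluster(id)
```

In the wide layout, dropping rows 3, 6 and 8 leaves 5 rows, or 10 joint observations; in the long layout 12 joint observations remain available.  Note that the odds ratio from logistic with cluster() is a population-averaged one, so it need not equal the patient-specific odds ratio from xtlogit, re.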

