
Re: st: Re: Analysis of binary cluster data with missing values

From	Joseph Coveney
To	Statalist
Subject	Re: st: Re: Analysis of binary cluster data with missing values
Date	Fri, 14 Nov 2003 01:22:33 +0900

Louise Linsell wrote (excerpted):

--------------------------------------------------------------------------------

I have a dataset that looks something like this - each person has
observations on 2 hips and 2 knees and I wish to calculate the odds ratio
of hips to knees, taking account of the clustering within person using
robust standard errors.

id    hip    knee    row
1     1      0       1
1     1      0       2
2     .      1       3
2     0      1       4
3     1      0       5
3     0      .       6
4     1      0       7
4     .      .       8

(1 = replaced, 0 = not replaced, . = missing)

I have tried all of the following commands 

logistic hip knee, cluster(id)
xtlogit hip knee, i(id) or
svymean hip, by(knee)

but all of them drop any record (line) with a missing value (.), so in
the above dataset, rows 3, 6 and 8 would be dropped, despite there being
a non-missing observation in the other group that should be included in
the analysis. . . .

Any ideas or suggestions on how to calculate the correct odds ratio and
robust s.e. would be much appreciated.

--------------------------------------------------------------------------------
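
The row loss Louise describes can be reproduced on her eight example rows. The following is a minimal sketch and not part of the original posts: it assumes the values are typed in exactly as excerpted, with 0 standing for "not replaced", and it only counts rows rather than fitting a model.

```stata
* Sketch only: Louise's eight example rows, entered by hand
clear
input byte(id hip knee)
1 1 0
1 1 0
2 . 1
2 0 1
3 1 0
3 0 .
4 1 0
4 . .
end

* A model of hip on knee uses only rows where both are observed
count if !missing(hip, knee)    // 5 complete rows (rows 3, 6 and 8 are lost)

* Joint observations that are actually available
count if !missing(hip)          // 6 hips
count if !missing(knee)         // 6 knees
```

In the wide layout only 10 of the 12 observed joints enter the analysis, because one hip and one knee share a row with a missing partner.
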

Louise mentions in a later posting that the dataset does not pair the joints
by side: a hip and a knee that share a record are not necessarily both left or
both right.  The joints are therefore clustered only within patient, and each
patient simply contributes observations on Joint A (hips) and Joint B (knees).
To accommodate this, Louise might wish to consider reshaping the data to long
form, coding the hip as "joint 0" and the knee as "joint 1" (or vice versa),
and fitting a random-effects logistic regression.  The resulting odds ratio
compares the odds of replacement for Joint 1 with those for Joint 0.  Because
each row then holds a single joint, only joints whose own status is missing
are dropped, so the method makes full use of the available data.  The do-file
below illustrates this with an artificial dataset designed to mimic Louise's.
Much of the do-file just creates the fictitious dataset for illustration.

Joseph Coveney

--------------------------------------------------------------------------------

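// --- Create an artificial dataset that mimics Louise's ---
clear
set obs 2000
set seed 20031113
// Two records per patient: pid = 1, 1, 2, 2, ..., 1000, 1000
generate int pid = mod(_n, 2)
replace pid = sum(pid)
// uniform() is the older name for what newer Stata releases call runiform()
generate float order = uniform()
sort order                         // Random order: no particular correlation
generate byte hip = _n <= 400      // 400 of 2000 hips replaced
replace order = uniform()
sort order                         // Reshuffle independently for the knees
drop order
generate byte knee = _n <= 600     // 600 of 2000 knees replaced
replace hip = . if uniform() > 0.95 // 5% MCAR for both joints
replace knee = . if uniform() > 0.95
// --- Louise can begin here ---
sort pid
by pid: generate byte side = _n    // Record number (1 or 2) within patient
rename hip replaced0               // Stubs for reshape: joint 0 = hip
rename knee replaced1              //                    joint 1 = knee
reshape long replaced, i(pid side) j(joint)
drop side
// Random-effects logistic regression; "or" reports the odds ratio
// for replacement of joint 1 (knee) relative to joint 0 (hip)
xtlogit replaced joint, i(pid) re or
exit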

--------------------------------------------------------------------------------
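
As a small check on the do-file above, the same reshape can be applied to the eight rows from Louise's excerpt. This sketch is not from the original posting. It assumes the rows are entered by hand with 0 meaning "not replaced", and it keeps the do-file's coding of joint 0 = hip and joint 1 = knee. The last command is Louise's own -logistic- call with cluster(), included only as an assumed alternative to the random-effects model for readers who want robust standard errors. It estimates a population-averaged rather than a patient-specific odds ratio, and with only four patients its standard error should not be trusted.

```stata
* Sketch only: reshape Louise's eight example rows to long form
clear
input byte(id hip knee)
1 1 0
1 1 0
2 . 1
2 0 1
3 1 0
3 0 .
4 1 0
4 . .
end

bysort id: generate byte side = _n   // record number within patient
rename hip  replaced0                // joint 0 = hip
rename knee replaced1                // joint 1 = knee
reshape long replaced, i(id side) j(joint)
drop side

* One row per joint: only the 4 joints with missing status are unusable
count if !missing(replaced)          // 12 of 16 rows remain
tabulate joint replaced              // 2 x 2 table behind the odds ratio

* Assumed alternative: pooled logistic regression with standard errors
* clustered on patient (syntax check only on a dataset this small)
logistic replaced joint, cluster(id)
```

All 12 observed joints are retained in the long layout, whereas the wide layout keeps only the rows in which both the hip and the knee are observed.
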



