
From: "Paul Walsh" <yousentwhohome@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: Can I repeatedly sample with constraints from an unbalanced data set to balance it?
Date: Sat, 27 Oct 2007 08:47:29 -0700

I have a 700-subject data set from a clinical trial comparing two treatments for a particular disease, with hospital admission from an emergency room as the outcome. I am also using a three-point ordinal scale of disease severity that strongly predicts hospital admission regardless of treatment.

Though the trial was designed to balance the two treatment arms, calculating disease severity was too cumbersome to do in the emergency room, so it could not be used to give each treatment arm equal numbers of each severity category. Thus there is an unequal distribution of case severity in the two arms.

When I calculate the unadjusted risk ratio of admission for the two treatments, I obtain a low, non-significant crude RR, similar to already published studies that did not account for severity. When I model the treatments and include the severity score, the adjusted RR increases and is significant, demonstrating superiority of one treatment over the other. The manuscript reviewers feel that the study should have balanced the severity scores in both treatment arms instead of including severity as a variable.

I'd like to run jackknife or bootstrap estimations of the unadjusted RR, constraining each jackknife/bootstrap replication to select equal numbers of patients receiving each treatment with each severity score. The goal is to repeatedly select samples from the data set that contain equal numbers of patients in each of the six groups (two treatments, three severity classifications).

Can someone comment on the feasibility of doing this in the bootstrap/jackknife context? Since this is not random sampling from the data set, how would this procedure affect the interpretation of the bootstrapped/jackknifed results? If feasible and interpretable, can someone suggest some code that would do this, or suggest another way of achieving the same goals?

Paul Walsh
Bakersfield CA
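The constrained resampling described above can be sketched as follows. This is only a minimal illustration in Python, not Stata code and not a recommended analysis. The field names (`treatment`, `severity`, `admitted`), the arm labels `"A"` and `"B"`, the function names, and every count in `make_toy_data` are invented for the example and do not come from the trial. The sketch assumes each replication draws, with replacement, as many patients from every treatment-by-severity cell as the smallest cell contains, and it uses a simple percentile interval.

```python
import math
import random
from collections import defaultdict


def group_by_cell(records):
    """Group records by their (treatment, severity) cell."""
    cells = defaultdict(list)
    for rec in records:
        cells[(rec["treatment"], rec["severity"])].append(rec)
    return cells


def balanced_resample(records, per_cell, rng):
    """Draw per_cell records with replacement from every cell."""
    sample = []
    cells = group_by_cell(records)
    for key in sorted(cells):
        sample.extend(rng.choices(cells[key], k=per_cell))
    return sample


def risk_ratio(records, exposed="A", reference="B"):
    """Crude risk ratio of admission, exposed arm versus reference arm."""
    def risk(arm):
        arm_recs = [r for r in records if r["treatment"] == arm]
        return sum(r["admitted"] for r in arm_recs) / len(arm_recs)

    ref_risk = risk(reference)
    # Guard against a resample with no admissions in the reference arm.
    return risk(exposed) / ref_risk if ref_risk > 0 else math.inf


def balanced_bootstrap_rr(records, reps=1000, seed=12345):
    """Percentile interval for the RR over balanced resamples."""
    rng = random.Random(seed)
    # Assumption: the smallest cell sets the per-cell sample size.
    per_cell = min(len(v) for v in group_by_cell(records).values())
    estimates = sorted(
        risk_ratio(balanced_resample(records, per_cell, rng))
        for _ in range(reps)
    )
    lower = estimates[int(0.025 * reps)]
    upper = estimates[int(0.975 * reps) - 1]
    return per_cell, lower, upper


def make_toy_data():
    """Invented data: (cell size, number admitted) per cell, 700 in total."""
    counts = {
        ("A", 1): (150, 15), ("A", 2): (120, 36), ("A", 3): (80, 48),
        ("B", 1): (90, 9), ("B", 2): (120, 48), ("B", 3): (140, 98),
    }
    data = []
    for (arm, sev), (size, admitted) in counts.items():
        for i in range(size):
            data.append({"treatment": arm, "severity": sev,
                         "admitted": 1 if i < admitted else 0})
    return data


if __name__ == "__main__":
    per_cell, lower, upper = balanced_bootstrap_rr(make_toy_data())
    print(per_cell, lower, upper)
```

On interpretation: forcing equal cell sizes weights the three severity levels equally in both arms, so the quantity being estimated is closer to a severity-standardized RR than to the crude RR of the trial population. The resulting interval describes that standardized quantity.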

**Follow-Ups**:
- **st: R: Can I repeatedly sample with constraints from an unbalanced data set to balance it?** *From:* "Carlo Lazzaro" <carlo.lazzaro@tin.it>
