
From: "Carlo Lazzaro" <carlo.lazzaro@tin.it>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: R: Can I repeatedly sample with constraints from an unbalanced data set to balance it?
Date: Sat, 27 Oct 2007 18:35:08 +0200

Dear Paul,

provided that I have understood your research need correctly, you might find it useful to perform a permutation test (see -help permute-) on the two patient samples you are comparing, as a sensitivity analysis of your base-case effectiveness results.

As you are surely aware, the theoretical assumptions of this test, which relies on random resampling without replacement, are well described in:

Efron B, Tibshirani RJ. An Introduction to the Bootstrap. New York: Chapman & Hall, 1993: 202-219 (particularly).

Sorry I cannot be more helpful.

Kind Regards,
Carlo

-----Original message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Paul Walsh
Sent: Saturday, 27 October 2007 17:47
To: statalist@hsphsun2.harvard.edu
Subject: st: Can I repeatedly sample with constraints from an unbalanced data set to balance it?

I have a 700-subject data set from a clinical trial comparing two treatments on one outcome (hospital admission from an emergency room) for a particular disease. I am also using a three-point ordinal scale of disease severity that strongly predicts hospital admission regardless of treatment.

Though the trial was designed to balance the two treatment arms, calculating disease severity in the emergency room is too cumbersome for it to have been used to balance each treatment arm with equal numbers in each severity category. Thus there is an unequal distribution of case severity in the two arms.

When I calculate the unadjusted risk ratio (RR) of admission for the two treatments, I obtain a low, non-significant crude RR, similar to already published studies that did not account for severity. When I model the treatments and include the severity score, the adjusted RR increases and is significant, demonstrating superiority of one treatment over the other.

The manuscript reviewers feel that the study should have balanced the severity scores in both treatment arms instead of including severity as a variable.

I'd like to run jackknife or bootstrap estimations of the unadjusted RR by constraining each jackknife/bootstrap replicate to select equal numbers of patients receiving each treatment with each severity score. The goal is to repeatedly select samples from the data set that produce equal numbers of patients in each of the six groups (two treatments, three severity classifications).

Can someone comment on the feasibility of doing this in the bootstrap/jackknife context? Since this is not random sampling from the data set, how would this procedure affect the interpretation of bootstrapped/jackknifed results? If feasible and interpretable, can someone suggest some code that would do this or suggest another way of achieving the same goals?

Paul Walsh
Bakersfield CA

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
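
The constrained resampling described in the original message (equal numbers of patients in each of the six treatment-by-severity groups) can be sketched as follows. This is only a minimal illustration in Python, not Stata code and not a method proposed by either poster. The row layout `(treat, severity, admit)`, the 0/1 coding of `treat` and `admit`, and all function names are assumptions made for the example. Each draw samples without replacement down to the size of the smallest group, so the spread of the resulting risk ratios reflects which patients happened to be kept, and it should not be read as an ordinary bootstrap distribution.

```python
import random
from collections import defaultdict


def risk_ratio(rows):
    """Crude risk ratio of admission, treatment 1 versus treatment 0.

    rows: iterable of (treat, severity, admit) tuples, with treat and
    admit coded 0/1 (a hypothetical layout chosen for this sketch).
    """
    events = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for treat, _severity, admit in rows:
        totals[treat] += 1
        events[treat] += admit
    risk0 = events[0] / totals[0]
    risk1 = events[1] / totals[1]
    # Raises ZeroDivisionError if arm 0 has no admissions in this sample.
    return risk1 / risk0


def balanced_subsample(rows, rng):
    """Draw the same number of rows, without replacement, from every
    treatment-by-severity cell (the size of the smallest cell)."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[0], row[1])].append(row)
    m = min(len(members) for members in cells.values())
    sample = []
    for key in sorted(cells):
        sample.extend(rng.sample(cells[key], m))
    return sample


def balanced_rr_distribution(rows, reps=1000, seed=12345):
    """Crude risk ratio recomputed on `reps` balanced subsamples."""
    rng = random.Random(seed)
    return [risk_ratio(balanced_subsample(rows, rng)) for _ in range(reps)]
```

Calling `balanced_rr_distribution(rows)` on the full list of patient tuples returns one crude RR per balanced draw; the seed is fixed only so that a run can be reproduced.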

**Follow-Ups**: **Re: st: R: Can I repeatedly sample with constraints from an unbalanced data set to balance it?** *From:* "Paul Walsh" <yousentwhohome@gmail.com>

**References**: **st: Can I repeatedly sample with constraints from an unbalanced data set to balance it?** *From:* "Paul Walsh" <yousentwhohome@gmail.com>

