Statalist



Re: st: R: Can I repeatedly sample with constraints from an unbalanced data set to balance it?


From   "Paul Walsh" <yousentwhohome@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: R: Can I repeatedly sample with constraints from an unbalanced data set to balance it?
Date   Sat, 27 Oct 2007 15:53:36 -0700

Thanks Carlo, I'll chew it over.

Paul

On 10/27/07, Carlo Lazzaro <carlo.lazzaro@tin.it> wrote:
>
> Dear Paul,
>
> provided that I have understood your research need correctly, as a
> sensitivity analysis of your base-case effectiveness results you might
> find it useful to perform a permutation test (see - help permute -) on the
> two samples of patients you are comparing.
>
> As you are surely aware, the theoretical basis of this test, which
> randomly resamples without replacement, is well described in:
>
> Efron B, Tibshirani RJ. An Introduction to the Bootstrap. New York:
> Chapman & Hall, 1993: 202-219 (particularly).
>
> Sorry I cannot be more helpful.
>
> Kind Regards,
>
> Carlo
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Paul Walsh
> Sent: Saturday, 27 October 2007 17:47
> To: statalist@hsphsun2.harvard.edu
> Subject: st: Can I repeatedly sample with constraints from an unbalanced
> data set to balance it?
>
> I have a 700-subject data set from a clinical trial comparing two
> treatments for outcome (hospital admission from an emergency room) for
> a particular disease. I am also using a three-point ordinal scale of
> disease severity that strongly predicts hospital admission regardless
> of treatment. Though the trial was designed to balance the two
> treatment arms, disease severity is too cumbersome to calculate in the
> emergency room, so it could not be used to balance each treatment arm
> with equal numbers of each severity category. Thus there is an unequal
> distribution of severity of cases in the two arms. When I calculate
> the unadjusted risk ratio of admission for two treatments I obtain a
> low, non-significant crude RR, similar to already published studies
> that did not account for severity. When I model the treatments and
> include the severity score, the adjusted RR increases and is
> significant, demonstrating superiority of one treatment over the
> other.
>
> The manuscript reviewers feel that the study should have balanced the
> severity scores in both treatment arms instead of including severity
> as a variable. I'd like to run jackknife or bootstrap estimations of
> unadjusted RR by constraining each jackknife/bootstrap to select equal
> numbers of patients receiving each treatment with each severity score.
> The goal is to repeatedly select samples from the data set that
> produce equal numbers of patients in each of the six groups (two
> treatments, three severity classifications). Can someone comment on
> the feasibility of doing this in the bootstrap/jackknife context?
> Since this is not random sampling from the data set, how would this
> procedure affect the interpretation of bootstrapped/jackknifed results?
> If feasible and interpretable, can someone suggest some code that
> would do this or suggest another way of achieving the same goals?
>
>  Paul Walsh
>
> Bakersfield CA
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
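
A rough sketch of the constrained resampling Paul describes, for readers who want to see the mechanics. It is written in Python rather than Stata, uses made-up cell counts (not the trial's data), and assumes one possible scheme: in each replicate, draw the same number of patients with replacement from each of the six treatment-by-severity cells (here, the size of the smallest cell), then compute the crude RR. All function names are hypothetical. Because the cells are forced to equal size, each replicate estimates an RR standardized to an equal mix of severity categories rather than the crude RR of the trial population, which bears on the interpretation question raised in the original post.

```python
import random
from collections import defaultdict

# Hypothetical (n, admitted) counts per (treatment, severity) cell.
# These numbers are invented for illustration; they are NOT the trial's data.
HYPOTHETICAL_CELLS = {
    (0, 1): (150, 15), (0, 2): (120, 36), (0, 3): (60, 36),
    (1, 1): (100, 5), (1, 2): (130, 26), (1, 3): (140, 56),
}


def make_hypothetical_trial():
    """Expand the invented cell counts into (treatment, severity, admitted) rows."""
    records = []
    for (treatment, severity), (n, admitted) in HYPOTHETICAL_CELLS.items():
        records += [(treatment, severity, 1)] * admitted
        records += [(treatment, severity, 0)] * (n - admitted)
    return records


def group_by_cell(records):
    """Map each (treatment, severity) cell to its list of 0/1 outcomes."""
    cells = defaultdict(list)
    for treatment, severity, admitted in records:
        cells[(treatment, severity)].append(admitted)
    return dict(cells)


def balanced_resample(cells, n_per_cell, rng):
    """Draw n_per_cell outcomes with replacement from every cell."""
    return {cell: rng.choices(outcomes, k=n_per_cell)
            for cell, outcomes in cells.items()}


def crude_rr(sample):
    """Risk in arm 1 divided by risk in arm 0; None if arm 0 has no events."""
    admitted = {0: 0, 1: 0}
    total = {0: 0, 1: 0}
    for (treatment, _severity), outcomes in sample.items():
        admitted[treatment] += sum(outcomes)
        total[treatment] += len(outcomes)
    risk0 = admitted[0] / total[0]
    risk1 = admitted[1] / total[1]
    return risk1 / risk0 if risk0 > 0 else None


def balanced_bootstrap_rr(records, reps=1000, seed=12345):
    """Return (median, 2.5th percentile, 97.5th percentile) of the replicate RRs."""
    rng = random.Random(seed)
    cells = group_by_cell(records)
    # Assumption: every cell is resampled to the size of the smallest cell.
    n_per_cell = min(len(outcomes) for outcomes in cells.values())
    estimates = []
    for _ in range(reps):
        rr = crude_rr(balanced_resample(cells, n_per_cell, rng))
        if rr is not None:  # skip replicates where the RR is undefined
            estimates.append(rr)
    estimates.sort()
    k = len(estimates)
    return estimates[k // 2], estimates[int(0.025 * k)], estimates[int(0.975 * k) - 1]


if __name__ == "__main__":
    med, lo, hi = balanced_bootstrap_rr(make_hypothetical_trial())
    print(f"balanced-bootstrap RR: median {med:.2f}, "
          f"95% percentile interval {lo:.2f} to {hi:.2f}")
```

In Stata, `bsample` with its `strata()` option (see - help bsample -) looks like the natural starting point for the same idea.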


