Statalist



Re: st: selecting random matched controls: survival


From   Michael McCulloch <[email protected]>
To   [email protected]
Subject   Re: st: selecting random matched controls: survival
Date   Fri, 28 Dec 2007 09:22:03 -0800

Hello Austin, I really appreciate your looking into my question. Yes, the data are correct: for each patient (tx==1), I want to make sure that I select only controls (tx==0) who've survived at least as long as the leadtime for that patient.

And yes, I want to select without replacement (i.e. each control has only one chance to be selected, and cannot simultaneously be matched with more than one patient).

Thanks for the suggestion of the random sort. I'm unclear about how to select the first N matching cases; that's the step I'd appreciate some help with. Since the dataset has hundreds of records, I'd like to implement this as a routine that I specify once and that then does the matching for all patients, sampling without replacement from the pool of controls.

Michael
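
[Editor's sketch] One way the request above might be written as a do-file is shown here. This is a minimal sketch, not a tested routine from the thread. It assumes the variables are named id, tx, t and leadtime (t stands in for _t in the stset data), that two controls are wanted per patient, and that patients are processed in ascending id order. matchid and eligible are hypothetical helper variables. runiform() assumes a recent Stata; older releases spell it uniform().

```stata
* Example data from the original post (t stands in for _t)
clear
input id tx t leadtime
1  1 1.0 0.4
2  1 1.2 0.3
11 0 0.1 .
12 0 0.9 .
13 0 1.1 .
14 0 0.5 .
end

set seed 12345
* One random number per record gives each control a random rank
generate double u = runiform()
* For controls: id of the patient they are matched to (missing = still in pool)
generate long matchid = .

levelsof id if tx == 1, local(patients)
foreach p of local patients {
    * Leadtime of the current patient, kept in a scalar at full precision
    summarize leadtime if id == `p', meanonly
    scalar lead = r(mean)
    * Eligible: unmatched controls who survived at least that long
    generate byte eligible = tx == 0 & missing(matchid) & t >= scalar(lead)
    * Eligible controls first, in random order within that group
    gsort -eligible u
    * Take at most two of them; matched controls leave the pool
    replace matchid = `p' if eligible & _n <= 2
    drop eligible
}

sort matchid id
list id t matchid if tx == 0 & !missing(matchid)
```

With the six example records, patient 1 receives two of the controls 12, 13 and 14, which leaves only one eligible control for patient 2. The matching is greedy, so the order in which patients are processed can change who ends up short of controls.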



Michael--
I don't understand your example data--is that correctly stated?  Do
you want to select with replacement or without? The general problem of
selecting for each observation a random subset of N other cases that
satisfy some condition is fairly easy, though, if you generate a
random number u, sort by u, then select the first N matching cases.
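
[Editor's sketch] For a single patient, the random-sort idea described above could look roughly like this. It is an illustration under assumptions, not code from the thread: the variable names id, tx, t and leadtime are assumed (t stands in for _t), the leadtime 0.4 of patient 1 is typed in by hand, N is taken to be 2, and eligible and picked are hypothetical helper variables. runiform() assumes a recent Stata; older releases spell it uniform().

```stata
* Example data from the original post (t stands in for _t)
clear
input id tx t leadtime
1  1 1.0 0.4
2  1 1.2 0.3
11 0 0.1 .
12 0 0.9 .
13 0 1.1 .
14 0 0.5 .
end

set seed 12345
* Random number u for every record
generate double u = runiform()
* Controls satisfying the condition for patient 1 (leadtime 0.4)
generate byte eligible = tx == 0 & t >= 0.4
* Put matching cases first, sorted by u within that group
gsort -eligible u
* The first N = 2 matching cases are the random selection
generate byte picked = eligible & _n <= 2
list id t if picked
```

Because the eligible controls are sorted to the top before the cutoff is applied, fewer than two are picked whenever fewer than two satisfy the condition.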

On Dec 24, 2007 2:11 PM, Michael McCulloch <[email protected]> wrote:
 Dear Statalist members,

 I would like to ask help in a routine for selecting random matched
 controls in a survival dataset.

 I have patients (tx=1) and controls (tx=0), and the data are _stset_.
 Time from diagnosis to death = _t.
 In the patients, I also know the time between diagnosis and treatment
 (leadtime).

 I wish to randomly select two controls per patient, from among all
 controls who have survived at least as long as the leadtime for that
 patient. For example, subject ID==1 has eligible controls:
         ID==12 | ID==13 | ID==14
 Therefore, I'd like to randomly select two controls from among those
 three eligible controls.

 ID      tx      _t      leadtime
 1       1       1.0     0.4
 2       1       1.2     0.3
 11      0       0.1
 12      0       0.9
 13      0       1.1
 14      0       0.5

 May I ask how this can be done in a do-file or small program?

 Happy Holidays to all,
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


