
From: "Austin Nichols" <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: selecting random matched controls: survival
Date: Fri, 28 Dec 2007 15:29:36 -0500

Michael--
As I understand it, the problem is to randomly select, without replacement, two "control" obs with tx==0 to match each "treatment" obs where tx==1, subject to the constraint that _t for each "control" obs is >= leadtime for the "treatment" obs. In this case, the conceptually simplest approach is to loop over the treatment cases in decreasing order of leadtime (since the highest values of leadtime also have the smallest sets of possible matches, if I understand you correctly). Let me modify ID 11 to have _t==.31 so the cases you gave can be easily matched:

    set seed 12345
    clear
    input ID tx _t leadtime
    1 1 1.0 0.4
    2 1 1.2 0.3
    11 0 .31 .
    12 0 0.9 .
    13 0 1.1 .
    14 0 0.5 .
    end
    * treatment obs first, highest leadtime first
    gsort -tx -leadtime
    g obs=_n
    qui levelsof obs if tx==1, loc(is)
    g match=.
    qui foreach i of local is {
     * restore the original order so that `i' points at the right obs
     sort obs
     local m=ID[`i']
     local lt=leadtime[`i']
     * random number for each unmatched, eligible control; missing otherwise
     g u=uniform() if mi(match) & tx==0 & _t>=`lt'
     * missings sort last, so the first two obs are a random pair of eligible controls
     sort u
     replace match=`m' in 1/2
     drop u
    }
    li

There is almost certainly a more efficient way to do this, perhaps using Mata, and possibly some of Ben Jann's contributions, but the above is simple enough to be understood easily. The idea is to pick each "treatment" obs in turn, sort the possible matches randomly, and allocate the first two possible matches (or put in another number instead of 2 to pick more matches) to the "treatment" obs whose turn it is, by assigning its ID to both of those control obs in the "match" variable. What you need to do downstream from here may necessitate some other processing, but the general approach can be modified to suit many purposes.

If you run into trouble with this approach, a likely culprit is that a condition like _t>=`lt' can fail to be satisfied even when it looks like it should be (when both are .3 to all appearances, for example; see various FAQs, e.g. http://www.stata.com/support/faqs/data/float.html), so that a bunch of missings are generated for your u variable.
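A minimal sketch of the precision issue on its own, apart from the matching code (the variable x and the local lt are made up for illustration, and this assumes the default float storage type): a value typed as .3 is stored as the nearest float, which is not the double-precision .3, and a local macro holds only a decimal text rendering of that value.

```stata
clear
input x
.3
end
* x is stored as a float, so it holds the nearest float to .3
display x[1]==.3
* the line above displays 0: the float is not equal to the double .3
display x[1]==float(.3)
* the line above displays 1: float() rounds .3 the same way storage did
local lt=x[1]
* the macro holds a decimal rendering of the stored value
display "`lt'"
* compare the stored value with the macro, raw and rounded back to float
display x[1]>=`lt'
display x[1]>=float(`lt')
```

The last two lines show whether the round trip through a macro preserves the comparison on your setup; wrapping the macro in float() is the safe form either way.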
This kind of thing can be hard to track down, but you can -set trace on- and remove the -qui- qualifiers to see what is going on inside the loop. You can also put a few commands of the form -list in 1/5- or somesuch inside the loop to see what changes are being made to the data at each step. Here's an example where it goes wrong (ID 11 now has _t==.3, which should match leadtime .3 for ID 2 but does not):

    set seed 12345
    clear
    input ID tx _t leadtime
    1 1 1.0 0.4
    2 1 1.2 0.3
    11 0 .3 .
    12 0 0.9 .
    13 0 1.1 .
    14 0 0.5 .
    end
    gsort -tx -leadtime
    g obs=_n
    qui levelsof obs if tx==1, loc(is)
    g match=.
    qui foreach i of local is {
     sort obs
     local m=ID[`i']
     local lt=leadtime[`i']
     g u=uniform() if mi(match) & tx==0 & _t>=`lt'
     sort u
     replace match=`m' in 1/2
     drop u
    }
    li

which can be sorted out with a simple trick, wrapping the macro in float():

    set seed 12345
    clear
    input ID tx _t leadtime
    1 1 1.0 0.4
    2 1 1.2 0.3
    11 0 .3 .
    12 0 0.9 .
    13 0 1.1 .
    14 0 0.5 .
    end
    gsort -tx -leadtime
    g obs=_n
    qui levelsof obs if tx==1, loc(is)
    g match=.
    qui foreach i of local is {
     sort obs
     local m=ID[`i']
     local lt=leadtime[`i']
     * float() rounds the macro's value back to float precision
     g u=uniform() if mi(match) & tx==0 & _t>=float(`lt')
     sort u
     replace match=`m' in 1/2
     drop u
    }
    li

but even better would be to add a stable sort and an assertion that stops the loop whenever fewer than two eligible controls are left:

    set seed 12345
    clear
    input ID tx _t leadtime
    1 1 1.0 0.4
    2 1 1.2 0.3
    11 0 .3 .
    12 0 0.9 .
    13 0 1.1 .
    14 0 0.5 .
    end
    gsort -tx -leadtime
    g obs=_n
    qui levelsof obs if tx==1, loc(is)
    g match=.
    qui foreach i of local is {
     sort obs
     local m=ID[`i']
     local lt=leadtime[`i']
     g u=uniform() if mi(match) & tx==0 & _t>=float(`lt')
     sort u, stable
     * stop with an error if either of the first two obs is not an eligible control
     assert u<. in 1/2
     replace match=`m' in 1/2
     drop u
    }
    li

(see -help sort- and -help assert- for details).
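As a rough sketch of a check one might run after the loop (the variables nmatch and lt_tx and the local ids are hypothetical names, not part of the code above), the following rebuilds the toy data, runs the guarded loop, and then asserts that no treatment obs was overwritten, that every assigned ID has exactly two controls, and that each control satisfies the _t >= leadtime constraint:

```stata
set seed 12345
clear
input ID tx _t leadtime
1 1 1.0 0.4
2 1 1.2 0.3
11 0 .3 .
12 0 0.9 .
13 0 1.1 .
14 0 0.5 .
end
gsort -tx -leadtime
g obs=_n
qui levelsof obs if tx==1, loc(is)
g match=.
qui foreach i of local is {
 sort obs
 local m=ID[`i']
 local lt=leadtime[`i']
 g u=uniform() if mi(match) & tx==0 & _t>=float(`lt')
 sort u, stable
 assert u<. in 1/2
 replace match=`m' in 1/2
 drop u
}
* no treatment obs should ever have been assigned a match
assert mi(match) if tx==1
* every assigned treatment ID should appear on exactly two controls
bysort match: g nmatch=_N if !mi(match)
assert nmatch==2 if !mi(match)
* copy each treatment's leadtime onto its matched controls
g lt_tx=.
qui levelsof ID if tx==1, loc(ids)
qui foreach m of local ids {
 su leadtime if ID==`m', meanonly
 replace lt_tx=r(mean) if match==`m'
}
* _t and lt_tx are both stored as floats, so this comparison is exact
assert _t>=lt_tx if !mi(match)
li ID tx _t leadtime match lt_tx
```

Because lt_tx is a stored float variable rather than a macro, the final comparison does not need the float() wrapper used inside the loop.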

Follow-Ups:
    Re: st: selecting random matched controls: survival
    From: Michael McCulloch <mm@pinest.org>

References:
    Re: st: selecting random matched controls: survival
    From: "Austin Nichols" <austinnichols@gmail.com>
