Statalist



Re: st: control selection - looping over observations not possible - alternatives


From   Maarten buis <[email protected]>
To   [email protected]
Subject   Re: st: control selection - looping over observations not possible - alternatives
Date   Tue, 26 Jan 2010 15:27:35 +0000 (GMT)

--- On Tue, 26/1/10, raoul reulen wrote:
> I need to select up to 20 controls for each of 10,000
> subjects from a dataset of around half-a-million
> subjects. The controls need to satisfy certain criteria
> (e.g., same age). How can I do this without having to
> loop over observations? 

What about this?

*-------------------- begin example ----------------------------
// prepare some toy data
// we want to find 2 controls per treated observation 
// with the same patterns on x1 and x2 (if 2 such controls exist)
clear
set obs 1000
gen byte treat = _n <= 100
gen x1 = floor(runiform()*6)
gen x2 = floor(runiform()*6)

// find the number of treated observations per pattern in x1 and x2
bys x1 x2: gen long count = sum(treat)
by x1 x2: gen long Ntreat = count[_N] 
drop count

// touse is an indicator variable indicating that 
// the observation is to be used
// we want to include all treated observations
gen byte touse = treat

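// For each treated observation we find a maximum 
// of 2 controls with the same pattern in x1 and x2.
// Ntreat is constant within each pattern, so it can be used directly
// in the if condition, which is evaluated for every observation.
// (An inline `= exp' expression would be evaluated only once, for the
// whole dataset, before -by- runs, so it cannot vary across groups.)
bys x1 x2 treat: replace touse = 1 if treat == 0 & _n <= 2*Ntreat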
*-------------------------- end example ----------------------------
( For more on how to use examples I sent to statalist see:
 http://www.maartenbuis.nl/stata/exampleFAQ.html )
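
Taking the first controls within each group means that which controls
are chosen depends on the current sort order. Below is a minimal sketch
of a variation that draws the controls at random within each pattern.
The seed value, the variable u, and the use of -egen, total()- are
assumptions made for illustration and are not part of the example
above. For the original question one would presumably replace the 2
by 20 and x1 x2 by the actual matching variables.

*-------------------- begin example ----------------------------
// toy data as before, with a seed so the draw is reproducible
clear
set seed 12345
set obs 1000
gen byte treat = _n <= 100
gen x1 = floor(runiform()*6)
gen x2 = floor(runiform()*6)

// number of treated observations per pattern in x1 and x2
bys x1 x2: egen long Ntreat = total(treat)

// u puts the observations in random order within
// each pattern and treatment status
gen double u = runiform()

// keep all treated observations, and the first 2*Ntreat
// controls in that random order (or fewer if fewer exist)
bys x1 x2 treat (u): gen byte touse = (treat == 1) | (_n <= 2*Ntreat)

// check how many treated and controls were selected per pattern
table x1 x2 treat if touse
*-------------------------- end example ----------------------------

Note that this selects a pool of controls per pattern; it does not 
assign particular controls to particular treated observations.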

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------


      
