Statalist



Re: st: control selection - looping over observations not possible - alternatives


From   Jeph Herrin <[email protected]>
To   [email protected]
Subject   Re: st: control selection - looping over observations not possible - alternatives
Date   Tue, 26 Jan 2010 10:29:30 -0500

More details would be helpful, but using what you have...
Suppose you have a variable -subject- indicating (0/1) whether
an observation is a subject, and that -age- is an integer. Then

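 * number of subjects in each age stratum
 bys age : egen numsubs=total(subject)
 * random number used to order observations within stratum
 gen random=uniform()
 gen numcontrols=20*numsubs
 * flag the first numcontrols non-subjects in each age stratum
 bys age subject (random): gen control=(_n<=numcontrols)&!subject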

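would do it: within each age, -control- flags a random set of up to
20 non-subjects per subject (pooled by age, not paired to a particular
subject). You can generalize this by creating a variable -group-
that stratifies subjects by your criteria.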
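
A minimal sketch of that generalization, run on simulated data so it can be tried as is. The variable -sex-, the seed, and the simulated sample are assumptions for illustration only and are not from the original question. -runiform()- is the newer name for -uniform()-.

```stata
* Simulated data, for illustration only (not the poster's dataset)
clear
set seed 2010
set obs 1000
gen age = 20 + floor(40*runiform())    // integer ages 20-59
gen byte sex = runiform() < 0.5        // hypothetical second matching criterion
gen byte subject = runiform() < 0.02   // roughly 2% of observations are subjects

* One stratum per combination of the matching criteria
egen group = group(age sex)

* Number of subjects, and of controls wanted, in each stratum
bysort group: egen numsubs = total(subject)
gen numcontrols = 20*numsubs

* Random order within stratum, then flag the first numcontrols non-subjects
gen double random = runiform()
bysort group subject (random): gen byte control = (_n <= numcontrols) & !subject

* Quick check: no subject should be flagged as a control
tabulate subject control
```

As in the age-only version, a stratum with fewer than 20 non-subjects per subject gets only as many controls as it has, and controls are pooled within a stratum rather than paired to individual subjects.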

hth,
Jeph



raoul reulen wrote:
Dear Statalisters

I need to select up to 20 controls for each of 10,000 subjects from a
dataset of around half-a-million subjects. The controls need to
satisfy certain criteria (e.g., same age). How can I do this without
having to loop over observations? Thanks.

Raoul
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
