Statalist



st: Randomly Sample Data w/o replacement by a Variable


From   Howard Lempel <HLempel@brookings.edu>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: Randomly Sample Data w/o replacement by a Variable
Date   Thu, 25 Jun 2009 11:52:50 -0400

Hello all,

I'd like to draw a simple random sample of X% (weighted) from each of several subsamples of my data, without replacement and without dropping the observations that were not selected.

For example, using the auto dataset, I'd like to create a new variable called "sample" that is equal to 1 for a randomly selected 75% of foreign cars and 75% of domestic cars (each weighted by the variable weight), and equal to 0 for all other cars.

Here's my current attempt, which requires -xtile2- (SSC).  Does anyone know if there is a way to do this in one line?

**** Start Code *****
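sysuse auto, clear
set seed 4635

*One random draw per car (runiform() is the current name for uniform())
gen random = runiform()

*Weighted quartiles of the random draw, computed separately by foreign
xtile2 rank = random [aw=weight], nq(4) by(foreign)

*Pick 75% for sample: the lowest three weighted quartiles in each group
gen sample = (rank<4)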
****** End Code *********
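
For comparison, here is a rough sketch of the same idea that does not need -xtile2-. It is not a one-liner, and it is only one possible approach: it puts the cars in random order within each foreign group, accumulates weight, and flags cars until the running share of weight reaches 75%. The variable names cumw and share are made up for illustration.

**** Start Code *****
sysuse auto, clear
set seed 4635
gen double random = runiform()

*Random order within each group, then running total of weight
bysort foreign (random): gen double cumw = sum(weight)

*Running share of the group's total weight
by foreign: gen double share = cumw / cumw[_N]

*Flag cars while the running weighted share is at most 75%
gen byte sample = (share <= 0.75)

*Check the weighted share selected in each group
tabulate foreign sample [aw=weight], row
****** End Code *********

Because cars are flagged only while the running share stays at or below 0.75, the weighted share selected in each group will usually come out slightly under 75% rather than exactly 75%.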

Thanks for your consideration.
Howie

Howie Lempel
Research Assistant
The Brookings Institution | Economic Studies
 
1775 Massachusetts Ave NW | Washington DC 20036
hlempel@brookings.edu | p: (202) 238-3576
 

