Marcella Sapun <msapun@jhu.edu> writes,
> I want to read randomly 10% of a data set that contains about 1 million
> records and 100 variables. How do I do that in stata?
Let's assume you are reading the data with -infile-. Then you could type
. infile ... if uniform()<=.10
That would read about 10% of the data. I recommend you set the random-number
seed first, so that you can reproduce the same sample later. Type
. set seed 493888736 <- choose a number at random
. infile ... if uniform()<=.10
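To see what setting the seed buys you, here is a small sketch that reads no
file at all. It rests on assumptions of my own: the 1,000,000 observations are
created with -set obs- rather than read from Marcella's file, and the variable
names keep1 and keep2 are hypothetical. It applies the same uniform()<=.10
rule twice from the same seed and checks that the same observations are
flagged both times.
. clear
. set obs 1000000
. set seed 493888736
. generate byte keep1 = uniform()<=.10
. set seed 493888736
. generate byte keep2 = uniform()<=.10
. * about 100,000 observations are flagged, but rarely exactly 100,000
. count if keep1
. * same seed, same draws, so the two flags agree on every observation
. assert keep1==keep2
In recent releases of Stata the function is documented under the name
runiform(); uniform() is the older name used here.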
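Marcella said her dataset had roughly 1 million observations, so
10% is about 100,000. The uniform() condition reads approximately that many
records, not exactly that many. If Marcella instead wanted to read every
10th observation of the original, she could type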
. infile ... if mod(_n,10)==0
That would read records 10, 20, 30, ..., which is a systematic 10% sample
rather than a random one.
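As a rough check of the mod() rule, the sketch below applies it to
observations already in memory instead of to records being read by -infile-.
The assumptions are mine: the data are made up with -set obs-, the variable
name recno is hypothetical, and _n is taken to play the same role for
observations in memory that it plays for records in the -infile- command above.
. clear
. set obs 1000000
. * remember each observation's original position
. generate long recno = _n
. * keep observations 10, 20, 30, ...
. keep if mod(_n,10)==0
. * exactly one-tenth of the 1,000,000 observations remain
. assert _N==100000
. * every kept observation sits at a multiple of 10 in the original order
. assert mod(recno,10)==0
. list recno in 1/3
Unlike the uniform() approach, this always keeps exactly one observation in
ten, and it needs no seed because nothing about it is random.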
-- Bill
wgould@stata.com