Statalist The Stata Listserver



Re: st: random sample


From   wgould@stata.com (William Gould, Stata)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: random sample
Date   Thu, 26 Oct 2006 10:20:40 -0500

Marcella Sapun <msapun@jhu.edu> writes, 

> I want to read randomly 10% of a data set that contains about 1 million
> records and 100 variables. How do I do that in stata?

Let's assume you are reading the data with -infile-.  Then you could type 

        . infile ... if uniform()<=.10

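That would read about 10% of the data, because each record is kept
independently with probability 0.10.  I recommend you set the random-number
seed first so that you can reproduce the same sample later, so type 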

        . set seed 493888736             <- choose a number at random
        . infile ... if uniform()<=.10
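
If an exact 10% sample is needed rather than roughly 10%, one option is Stata's -sample- command, which works on data already in memory. The lines below are a minimal do-file sketch, not part of the original reply: "mydata" is a placeholder dataset name, and the whole file must fit in memory before sampling. In newer Stata releases the random-number function is named runiform() rather than uniform().

```stata
* Sketch: draw an exact 10% random sample from data already in memory.
* "mydata" is a placeholder name, not taken from the original post.
* Setting the seed first makes the same sample come out on every run.
set seed 493888736
use mydata, clear
* Keep 10% of the observations, drawn at random without replacement.
sample 10
* Report how many observations were kept.
count
```

Unlike the -if uniform()<=.10- filter, this reads the full dataset first, so it trades memory for an exact sample size.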


Marcella said her dataset had roughly 1 million observations, so 
10% is about 100,000; the -uniform()- filter will read approximately, not
exactly, that many.  If Marcella instead wanted to read every 10th
observation of the original, she could type 

        . infile ... if mod(_n,10)==0

That would read records 10, 20, 30, ...
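
The same expression can be applied to data that are already loaded. The following is a hedged do-file sketch under that assumption; "mydata" is again a placeholder name. Note that this is a systematic sample, not a random one, so it depends on the order of the observations in the file.

```stata
* Sketch: keep every 10th observation of a dataset already in memory.
* "mydata" is a placeholder name, not taken from the original post.
use mydata, clear
* _n is the current observation number; keep observations 10, 20, 30, ...
keep if mod(_n, 10) == 0
* Report how many observations remain.
count
```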


-- Bill
wgould@stata.com