Statalist The Stata Listserver



Re: st: random sample


From   "Marcella Sapun" <msapun@jhu.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   Re: st: random sample
Date   Thu, 26 Oct 2006 15:43:24 -0400

Bill,
Thank you for the nice suggestion!
Marcella.


>>> William Gould, Stata <wgould@stata.com> 10/26/2006 11:20 AM >>>
Marcella Sapun <msapun@jhu.edu> writes, 

> I want to read randomly 10% of a data set that contains about 1 million
> records and 100 variables. How do I do that in stata?

Let's assume you are reading the data with -infile-.  Then you could
type 

        . infile ... if uniform()<=.10

That would read about 10% of the data.  I recommend you set the
random-number seed first, so type 

        . set seed 493888736             <- choose a number at random
        . infile ... if uniform()<=.10
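
If the full dataset does fit in memory, an alternative sketch is to load
everything and then use -sample-, which keeps 10% of the observations
(up to rounding) rather than about 10%.  The dataset name mydata is
hypothetical, not something from the original question:

        . use mydata, clear              <- hypothetical dataset name
        . set seed 493888736
        . sample 10                      <- keep a random 10% of observations
        . count

In newer releases of Stata the random-number function is named
runiform(); uniform() is the older name for it.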


Marcella said her dataset had roughly 1 million observations, so 
10% is 100,000.  If Marcella wanted to read every 10th observation of 
the original, she could type 

         . infile ... if mod(_n,10)==0

That would read records 10, 20, 30, ...
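
The same idea should work for other sampling intervals and starting
records.  As a sketch, assuming _n counts records as in the command
above, this would read records 1, 21, 41, ..., which is 5% of the file:

         . infile ... if mod(_n,20)==1

Note that this gives a systematic sample rather than a random one, so it
could mislead if the file is ordered with some periodic pattern.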


-- Bill
wgould@stata.com 
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html 
*   http://www.stata.com/support/statalist/faq 
*   http://www.ats.ucla.edu/stat/stata/


