

From: Richard Williams <richardwilliams.ndu@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Creating a smaller dataset from a larger one.
Date: Mon, 13 Aug 2012 16:04:00 -0500

At 10:47 AM 8/13/2012, Le Wang wrote:

Dear Amal,

Stata has a built-in program called -sample- to draw a random sample. See the link below for a detailed tutorial on this command.

http://www.ats.ucla.edu/stat/stata/faq/sample.htm

Hope that helps.

Le

On Mon, Aug 13, 2012 at 10:31 AM, Amal Khanolkar <Amal.Khanolkar@ki.se> wrote:

> Hello all,
>
> I have a very large dataset with almost 3 million subjects. It is great to work with, but a bit difficult to transport or carry with me. I would prefer to create a smaller sub-dataset with, say, 100,000 subjects chosen at random. As I'm interested in studying ethnic differences, I use the variable 'Motherland', which denotes country of birth, in the code below to help create my sub-dataset. However, with the code I'm currently using, I get (I think) the first 100,000 subjects, which is then not a random sample. How may I change the code below to choose 100,000 (or any number I wish) subjects at random?
>
> I use the following code to create a subset of my original dataset:
>
> *Creating a subsample of the dataset with say 100,000 subjects*
>
> // create random variable
> gen x = runiform()
>
> // sort by country and x
> sort motherland x
>
> // create a variable within country identifying the first 10% (change this proportion as you wish)
> by motherland: gen subsamp = _n <= (_N+0.5)*0.10
>
> tab motherland subsamp, col
>
> tab motherland kon if magecat!=. & education!=. & famsit_new!=. & smoke1!=. & parity!=. & zscore_gest!=. & MBMI2!=. & mlangd!=. & multibirth==2 & subsamp==1, col
>
> Thanks for any help,
>
> /Amal.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/

--
Le Wang, Ph.D
Assistant Professor
Department of Economics
University of New Hampshire
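As a rough sketch of the -sample- approach suggested above, the lines below draw a fixed number of subjects at random, and then show the variant that draws within each country of birth. The dataset names (fulldata, subsample, subsample_by_country) and the seed value are placeholders made up for illustration; only the variable motherland comes from the original post.

```stata
* Sketch only: file names and the seed are hypothetical placeholders.
use fulldata, clear

* Fix the seed so the same random subsample can be drawn again later.
set seed 20120813

* Keep exactly 100,000 observations, drawn at random without replacement.
sample 100000, count
save subsample, replace

* Alternative: keep a random 10% of the observations within each country.
use fulldata, clear
set seed 20120813
sample 10, by(motherland)
save subsample_by_country, replace
```

Without the count option, the number given to -sample- is read as a percentage, so the by(motherland) variant keeps roughly the same 10% share of every country that the original code aims for.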

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: Richard.A.Williams.5@ND.Edu
WWW: http://www.nd.edu/~rwilliam

