
Re: st: random samples within each of 1,152 categories


From:    Maarten Buis <maartenlbuis@gmail.com>
To:      statalist@hsphsun2.harvard.edu
Subject: Re: st: random samples within each of 1,152 categories
Date:    Wed, 8 May 2013 10:00:38 +0200

On Wed, May 8, 2013 at 3:18 AM, Olga Gorbachev wrote:
> for each cell, I need to compute a fraction of not working females.  I
> generated the needed fraction (swr), like this:
<snip>
> I then  generated categories/cells using group command:
>
> egen x=group(ed wife nokid white year)
>
> this gave me 1,152 groups.
>
> Thus, for each x I have a number given by swr that tells me what
> percentage I want to sample. So for x=1, swr=43.2
>
> Thus, I'd like to 'designate' randomly for x=1 if work=1 43.2% of
> females to be not working. And I'd like to do this for each x.

See:
M.L. Buis (2007), "Stata tip 48: Discrete uses for uniform()", The
Stata Journal, 7(3), pp. 434-435.
<http://www.stata-journal.com/article.html?article=pr0032>

Here is an example that applies that logic to your problem:

*------------------ begin example ------------------
// data preparation
sysuse nlsw88, clear

gen byte occat = cond(occupation < 3                 , 1,      ///
                 cond(inlist(occupation, 5, 6, 8, 13), 2, 3))  ///
                 if occupation < .
label variable occat "occupation in categories"
label define occat 1 "high"   ///
                   2 "middle" ///
                   3 "low"
label value occat occat

gen byte edcat = cond(grade <  12, 1,     ///
                 cond(grade == 12, 2, 3)) ///
                 if grade < .
label define edcat 1 "less than high school" ///
                   2 "high school"           ///
                   3 "more than high school"
label value edcat edcat
label variable edcat "education in categories"

// define the sample
gen byte touse = !missing(race, edcat, occat, married)

// create the group indicator
egen group = group(race edcat occat) if touse

// create the proportion of married women per group
bys group : egen p = mean(married) if touse

// sample a new married variable
gen byte married_sim = runiform() < p if touse
*------------------- end example -------------------
(For more on the examples I send to Statalist, see:
http://www.maartenbuis.nl/example_faq )
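
The same logic might carry over to the variables in the original question roughly as follows. This is a minimal sketch, not a tested solution: it assumes that swr is stored as a percentage (such as 43.2) and is constant within each x, and that work == 1 marks the women to sample from. The names notwork_sim and share_sim and the seed are made up for illustration. Since runiform() draws from the uniform distribution on [0,1), the expression runiform() < swr/100 is 1 with probability swr/100.

```stata
// Sketch only. x, swr and work come from the original question;
// notwork_sim and share_sim are hypothetical names used for illustration.
set seed 12345    // arbitrary seed so the draw can be reproduced

// swr is assumed to be a percentage (e.g. 43.2), so rescale it to a proportion.
// The !missing(swr) guard matters: a missing value is larger than any number
// in Stata, so runiform() < . would otherwise evaluate to 1.
gen byte notwork_sim = runiform() < swr/100 if work == 1 & !missing(swr)

// compare the simulated percentage in each cell with the target swr
bys x : egen share_sim = mean(100*notwork_sim)
```

Because each woman is drawn independently, share_sim will only approximate swr within a cell, and the approximation will be rougher in small cells.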

Hope this helps,
Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------