

From: Maarten Buis <maartenlbuis@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: Re: random samples within each of 1,152 categories
Date: Tue, 14 May 2013 10:16:06 +0200

On Wed, May 8, 2013 at 6:41 PM, Olga Gorbachev wrote:
> Thank you very much for your reply and sorry for replying to you
> directly, I couldn't figure out how to reply to list serve, since I
> get digest version.

Follow-up questions must be sent to the Statalist. In this case I would just start a new question. This is not perfect, but it is much better than replying to someone privately.

> also, I don't subscribe to stata journal, so I couldn't read the
> article you referenced, sorry.

The article I referred to earlier has passed the moving wall, so you can read it free of charge even if you don't subscribe; just follow the link I gave you. (The article in question is: M.L. Buis (2007), "Stata tip 48: Discrete uses for uniform()", The Stata Journal, 7(3), pp. 434-435. <http://www.stata-journal.com/article.html?article=pr0032>)

> can your example work with weights? I'd like to be able to match the
> means of the distribution as well, and since it is survey data, sample
> weights are important.

For weights you need to use -collapse- with the weights to compute the weighted means; otherwise everything remains the same. See the example below:

*------------------ begin example ------------------
// data preparation
sysuse nlsw88, clear

// create some "weights"
gen w = 1/(.2 + .6*runiform())

gen byte occat = cond(occupation < 3 , 1,                     ///
                 cond(inlist(occupation, 5, 6, 8, 13), 2, 3)) ///
                 if occupation < .
label variable occat "occupation in categories"
label define occat 1 "high"   ///
                   2 "middle" ///
                   3 "low"
label value occat occat

gen byte edcat = cond(grade <  12, 1,     ///
                 cond(grade == 12, 2, 3)) ///
                 if grade < .
label define edcat 1 "less than high school" ///
                   2 "high school"           ///
                   3 "more than high school"
label value edcat edcat
label variable edcat "education in categories"

// define the sample
gen byte touse = !missing(race, edcat, occat, married)

// create the group indicator
egen group = group(race edcat occat) if touse

tempfile temp
save `temp'

// create the weighted proportion of married women per group
// and store it as p, which is used for the sampling below
collapse (mean) p = married [pw=w] if touse , by(group)
merge 1:m group using `temp'
assert _merge == 2 if touse == 0
assert _merge == 3 if touse == 1
drop _merge

// sample a new married variable
gen byte married_sim = runiform() < p if touse
*------------------- end example -------------------

(For more on examples I sent to the Statalist see: http://www.maartenbuis.nl/example_faq )

-- Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------
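
One way to check the result, as a rough sketch only: compare the weighted proportion of the observed married variable with the weighted proportion of the simulated one within each group. This assumes the example above has just been run, so that w, touse, group, married and married_sim are in memory. The variable name diff is made up for this illustration.

```stata
// assumes the data from the example above are still in memory
preserve

// weighted proportions of the observed and simulated variables per group
collapse (mean) married married_sim [pw=w] if touse, by(group)

// diff is a hypothetical helper variable: simulated minus observed
gen diff = married_sim - married
list group married married_sim diff, clean noobs

restore
```

Because married_sim is a random draw, the two columns should agree only approximately, and less closely in small groups. Typing -set seed- with a number of your choice before running the example makes the draws reproducible.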

**Follow-Ups**: **Re: st: Re: random samples within each of 1,152 categories**, *From:* Nick Cox <njcoxstata@gmail.com>

**References**: **st: random samples within each of 1,152 categories**, *From:* Olga Gorbachev <olga.gorbachev@gmail.com>
