The Stata listserver

st: re: Sample with weights


From   Mike Lacy <Michael.Lacy@colostate.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: re: Sample with weights
Date   Sat, 01 Oct 2005 09:04:56 -0600

Date: Fri, 30 Sep 2005 11:48:40 +0200
From: "Willard van Ooij" <w.van.ooij@marktmonitor.com>
Subject: st: Sample with weights

Dear statalisters,

I have a population of companies. I want a sample from this population,
but the probability of a company being sampled has to be proportional to
the number of employees (let's call this "size").

So I thought I could run
-sample 10 [fweight=size]

to get a 10 percent sample. But -sample- doesn't accept weights. A
solution might be to expand the dataset with size, but then a company
can get sampled several times if it has more than one employee, and I
don't want that to happen. I think there must be a very simple solution,
but I haven't been able to find it.

The following is simple, produces a sample of exactly the desired size,
and I believe fulfills the condition of the probability of selection
being proportional to size.
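* Assume "Size" is the company size variable, and M is the desired sample size
* (M can be a scalar or a literal number)
* Exponential key with rate Size: the smallest key belongs to company i
* with probability Size_i / (sum of Size), so large companies tend to sort first
gen double ppsorder = -ln(uniform()) / Size
sort ppsorder
* Keep the M companies with the smallest keys; no company can appear twice
keep if _n <= M
drop ppsorder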

Yes, sorting the file is a bit clumsy, but this is presumably a one time thing,
not something appearing inside a loop.
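
For anyone who wants to try the idea before touching real data, here is a minimal sketch on made-up data. Everything in it is an assumption for illustration: the 1000 toy companies, the Size values from 1 to 100, the seed, and the variable names key and sampled. It orders companies by the key -ln(u)/Size, a common construction for weighted sampling without replacement, marks the first 10 percent instead of dropping the rest, and then compares mean Size between the two groups.

```stata
* Toy check on hypothetical data (not from the original question)
clear
set seed 12345
set obs 1000
* Hypothetical employee counts, integers from 1 to 100
gen Size = 1 + int(100 * uniform())
* Sample size for a 10 percent sample
local M = round(0.10 * _N)
* Exponential key with rate Size; a small key is more likely for a large Size
gen double key = -ln(uniform()) / Size
sort key
* Flag the M smallest keys rather than dropping the other rows
gen byte sampled = _n <= `M'
* Compare company size in the sampled and unsampled groups
tabstat Size, by(sampled) statistics(mean n)
```

If the ordering works as intended, the sampled group should contain exactly `M' companies and show a clearly higher mean Size than the unsampled group. Flagging rather than dropping is only a convenience for this check.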

Regards,


=-=-=-=-=-=-=-=-=-=-=-=-=
Mike Lacy
Fort Collins CO USA
(970) 491-6721 office





