At 02:00 PM 9/30/2005, you wrote:

Willard wrote: I have a population of companies. I want a sample from this population, but the probability of a company being sampled has to be proportional to its number of employees (let's call this "size").

I would take an approach like this:

(1) Calculate for each company the probability of inclusion. This is (sample size) * (size of company / total of company sizes). So assuming a sample size of 100:

. sum size

. gen prob = 100 * ( size / r(sum) )

(2) Then select the sample based on these probabilities

. gen u = uniform()

. gen insamp = u < prob
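
For anyone who wants to try the two steps end to end, here is a minimal sketch on made-up data. The 1,000 companies, their sizes, and the seed are hypothetical and only there so the commands run on their own; the check for probabilities above 1 is an extra I added, not part of the steps above. In newer Stata versions uniform() is called runiform().

```stata
* Hypothetical illustration data: 1,000 companies with made-up sizes
clear
set seed 12345
set obs 1000
gen size = ceil(500 * uniform())

* Step (1): inclusion probability proportional to size, target sample of 100
summarize size
gen prob = 100 * ( size / r(sum) )

* Added check (assumption, not in the original steps): a probability
* above 1 would mean that company is always selected
count if prob > 1

* Step (2): one independent uniform draw per company
gen u = uniform()
gen insamp = u < prob

* Number of companies actually selected in this draw
count if insamp
```

Changing the seed and rerunning shows how much the selected count moves from one draw to the next.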

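The problem is that this approach gives an *expected* sample size of 100 (the inclusion probabilities sum to 100), but the realized sample size varies from draw to draw and will not be exactly 100 in every case. Thoughts?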

Nick Winter

