
# RE: st: SVY question

 From "Pavlopoulos, D." To "statalist@hsphsun2.harvard.edu" Subject RE: st: SVY question Date Mon, 29 Aug 2011 19:24:17 +0000

```
Dear Steve,

thanks a lot!

best,
Dimitris

dr. Dimitris Pavlopoulos

Assistant Professor
Vrije Universiteit Amsterdam
Faculty of Social Sciences
dept. of Sociology
De Boelelaan 1081 (visiting address Metropolitan Building, Buitenveldertselaan 3)
1081 HV Amsterdam
the Netherlands

email: D.Pavlopoulos@fsw.vu.nl
tel: +31 (0)20 59 89254

________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Steven Samuels [sjsamuels@gmail.com]
Sent: 29 August 2011 03:38
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: SVY question

Correction:  To compute your own post-stratified weights:

Suppose the weighted proportion of workers in post-stratum k is p_k and the proportion of workers in the population in post-stratum k is P_k. Then create a new weight as
new_weight = myweight*(P_k/p_k)*((population total)/(sample sum of myweight))

The sample sum of the new_weight should equal the population total.

Steve

-
That's clear, Dimitris.

So ignore stratification at the worker stage.

****************************************
egen stratum = group(firm_size x_firm)
svyset firm_id [pweight = myweight] , strata(stratum)
***************************************

You must then construct your sampling weight as the product of two components: myweight = (1/p1) x (1/p2).

1. p1 = Pr(selecting a firm) = 30/(no. of firms in the firm's stratum)
2. p2 = Pr(selecting a worker within a selected firm) = (no. of workers selected)/(no. eligible), where the choices were made separately in each firm according to the selection plan you describe.

To check how well the estimated number of workers in each category matches the known numbers, run

******************
svy: tab firm_size x_firm, cell
******************

Unfortunately, I believe that with this design the numbers will not match well, with the bias possibly towards smaller firms. If this occurs, use the poststrata() and postweight() options of -svyset- to match the known numbers.

If you have information on individual firm sizes in the population, you can do better. Create a firm-size variable with more categories (e.g. 5-24, 25-99, 100-199, 200-299....). Estimate the numbers of workers in the population by refined firm-size category and x_firm, and compare them to the known numbers. If these differ, as I suspect they will, then post-stratify on these numbers instead.

If you intend to do longitudinal analyses with -xtmixed- in Stata 12, then you must compute the post-stratified probability weights yourself. Suppose the weighted proportion of workers in post-stratum k is p_k and the proportion of workers in the population in post-stratum k is P_k. Then create a new weight as new_weight = myweight*P_k/p_k.

Steve

On Aug 27, 2011, at 4:50 PM, Pavlopoulos, D. wrote:

Dear Steve,

thank you for your reply. Below you can find a description of my sampling:

- I select companies that do not have workers using the arrangement X. I exclude companies with fewer than 5 workers.
- I split the companies according to their size into groups having 5-24, 25-99, 100-499 and 500+ workers
- Within each of these groups I select a random sample of 30 companies.

- I select companies that have at least one worker using the arrangement X. I exclude companies with fewer than 5 workers.
- I split the companies according to their size into groups having 5-24, 25-99, 100-499 and 500+ workers
- Within each of these groups I select a random sample of 30 companies.

```
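
The corrected weight formula in Steve's message (new_weight = myweight*(P_k/p_k)*((population total)/(sample sum of myweight))) could be computed along the following lines. This is only a minimal sketch and not code from the thread: the variable names pstrat and P_k, the scalar N_pop, and the toy data are all assumptions made up so that the example runs on its own. In real data, myweight would be the design weight described above, and P_k would come from the known population counts.

```stata
* Illustrative sketch only, not part of the original thread.
* Hypothetical names: pstrat (post-stratum id), P_k (known population
* proportion of workers in the post-stratum), N_pop (known population total).
* The toy data below are invented purely to make the example runnable.
clear
input pstrat myweight P_k
1 10 0.4
1 10 0.4
1 10 0.4
2 35 0.6
2 35 0.6
end

scalar N_pop = 200

* Sample sum of the design weights, overall and within each post-stratum
egen double sumw   = total(myweight)
egen double sumw_k = total(myweight), by(pstrat)

* Weighted sample proportion of workers in each post-stratum
generate double p_k = sumw_k / sumw

* new_weight = myweight*(P_k/p_k)*((population total)/(sample sum of myweight))
generate double new_weight = myweight * (P_k / p_k) * (scalar(N_pop) / sumw)

* Check: the sample sum of new_weight should equal the population total
* (up to rounding), provided the P_k sum to 1 over the post-strata
quietly summarize new_weight
display "sum of new_weight = " r(sum) "   population total = " scalar(N_pop)
```

The check at the end mirrors the statement that the sample sum of new_weight should equal the population total. It relies on every post-stratum being represented in the sample and on the P_k summing to 1.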