# Re: st: how to generate the weights after sample and assign svyset

 From "Stas Kolenikov" To statalist@hsphsun2.harvard.edu Subject Re: st: how to generate the weights after sample and assign svyset Date Sun, 11 Mar 2007 20:15:26 -0500

```On 3/11/07, Anna Gueorguieva <anna_i_g@yahoo.com> wrote:
```
```1. I used the following code to generate my sample schools within each strata:

set seed  123456789
sample 4.28, by(lgacode numschools)

I was aiming to do a probability proportional to size (pps) sample but
I do not think this is it (correct me if I am wrong).
```
```This is not PPS -- there are a couple of implementations out there. Try
-findit pps- to locate some. I know that mine is not quite proper, but
rather approximate -- at the time of writing it, I was not aware of
all the complications of PPS sampling, which are many.

```
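For illustration only, here is a rough sketch of one common PPS scheme (systematic selection with a random start) in plain Stata, to show how it differs from -sample-. It is not one of the packages that -findit pps- turns up, and it ignores the complications mentioned above. The data are simulated, and the names school_code, size, and the sample size of 10 are assumptions made up for the example.

```stata
* Sketch: systematic PPS draw of n schools, with a made-up size measure
clear
set seed 123456789
set obs 100
gen school_code = _n
gen size = ceil(50 + 450*runiform())     // hypothetical enrollment, 51 to 500

local n = 10                              // number of schools to draw
quietly summarize size
local total = r(sum)
local step  = `total' / `n'               // sampling interval
local start = runiform() * `step'         // random start within the first interval

* Each school occupies the segment (cum_lo, cum_hi] of the cumulated sizes
gen double cum_hi = sum(size)
gen double cum_lo = cum_hi - size

* Count the selection points start + k*step that fall in each segment
gen hits = floor((cum_hi - `start') / `step') - floor((cum_lo - `start') / `step')
gen byte selected = hits > 0

* Inclusion probability proportional to size, and the matching weight
gen double pi = `n' * size / `total'
gen double weight = 1 / pi if selected

count if selected
```

In this toy frame the interval is larger than any single school, so no school can be hit twice. With very large units that no longer holds, and such units are usually taken with certainty and handled separately.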
```How does the by statement affect my sampling weights?

I think I just did simple random sampling and my code should be:
gen weight=1/sampling_probability=1/(.0428*numschools)
```
```The -by- statement does not affect the weights, but it does affect your
final sample size. Your -by- variables become strata. You would really
need that if you are estimating totals, such as the number of students
enrolled over the whole population. It will not matter that much when
you are estimating fractions, ratios, or regression lines.

```
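To make the weights concrete, here is a minimal sketch on simulated data. The variable strata stands in for the -by- variables of the question, and N_h, n_h, and the frame of 1,000 schools are assumptions made up for the example. The weight is the inverse of the realized sampling fraction within each stratum.

```stata
* Hypothetical frame: 1,000 schools in 5 strata of 200 each
clear
set seed 123456789
set obs 1000
gen school_code = _n
gen strata = ceil(_n/200)

* Record the stratum population count BEFORE sampling
bysort strata: gen N_h = _N

* Same 4.28% draw as in the question, taken separately within each stratum
sample 4.28, by(strata)

* Realized sample count per stratum, and weight = 1 / (n_h / N_h)
bysort strata: gen n_h = _N
gen double weight = N_h / n_h

* The weights add up to the frame size, which is what a total relies on
quietly summarize weight
display r(sum)
```

With the same percentage drawn in every stratum, the weights should come out close to 1/0.0428 everywhere, differing only through the rounding of the group sample sizes. Multiplying all weights by a constant would change an estimated total but not a ratio or a mean.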
```2. After the schools are sampled, teachers are sampled systematically:
One teacher within each class level as the teacher might be selected
as the first, last or middle name in an alphabetized list.
```
```Systematic sampling is tricky -- technically, you cannot estimate
variances due to that stage unless you take something randomly at
least twice.

```
```So my svyset statement for the school-level dataset should be:
svyset school_code [pw=weight], strata(strata) fpc(numschools_bystrata)
```