# Re: st: svyset

From: Rebecca Pietrelli; To: statalist@hsphsun2.harvard.edu; Subject: Re: st: svyset; Date: Fri, 2 Nov 2012 13:36:45 +0100

```
Thank you very much, Stas Kolenikov!

So, if the systematic procedure had worked with replacement, I could simply use:

svyset psu_var [pw=hhweights_var], strata(strata1_var)

I don't need to create strata2.

I was wondering whether I have to use the fpc. I know that it is necessary
to use it with a systematic procedure. Now I am confused because the
first stage is a combined procedure (PPS + systematic) and the second
stage is a systematic procedure.

Thank you,
Rebecca

2012/11/2 Stas Kolenikov <skolenik@gmail.com>:
> This design took a lot of compromises. Systematic sampling is THE
> worst sampling procedure, and the only excuse for using it is that you
> [...] margin of the printout of your list of EAs with cumulative population
> sizes. Still, systematic sampling could hit some large units (larger
> than {the total population size}/{# of units sampled}) more than once,
> so it could have worked effectively with replacement (which is what
> you want; without-replacement sampling is tedious to work with).
>
> The finite population correction is only relevant for SRS. It does not
> generalize well to PPS type designs, where, technically speaking, you
> should use joint (pairwise) probabilities of selection (and if you find
> yourself doing that, you would recall that systematic sampling does not
> allow unbiased variance estimation... that's why I said it is the
> worst method).
>
> You are correct regarding strata2: you'd have to create an indicator
> for the type of the HH.
>
> On Fri, Nov 2, 2012 at 6:06 AM, Rebecca Pietrelli
> <rebecca.pietrelli@gmail.com> wrote:
>> Hi,
>>
>> I really hope someone can help me.
>> I am using the Stata command svyset to declare the survey design
>> for the files of my dataset (Uganda Migration Households Survey 2010).
>>
>> The sampling design is two-stage stratified.
>> In the first stage, enumeration areas (EAs, the PSUs) were selected
>> separately for rural and urban areas. The procedure applied is PPS
>> (proportional to the number of households in the respective stratum
>> according to the 2006 Uganda household survey) combined with a
>> systematic approach.
>> In the second stage, households were selected in each EA using a
>> systematic procedure (4 hhs with international migrants, 3 with
>> internal migrants and 3 without migrants).
>>
>> I am planning to use the following command:
>>
>> svyset psu_var [pw=hhweights_var], strata(strata1_var) fpc(fpc1) ||
>> _n, strata(strata2_var) fpc(fpc2)
>>
>> I have the following doubts:
>>
>> 1) I am not sure whether the first stage is with or without replacement
>> (it is not mentioned anywhere in the report!). I suppose that a PPS
>> combined with a systematic procedure is without replacement. Is this
>> assumption correct?
>>
>> 2) I have the strata1_var (rural or urban) but I don't have the
>> strata2_var. In that case, should I create it? (e.g., strata2 = 1 if
>> the hh has no migrants, = 2 if it has internal migrants and = 3 if
>> it has international migrants).
>>
>> 3) I don't have fpc variables in the data. I am thinking of creating them as
>> [(N-n)/(N-1)]^(1/2). So in the first stage, is N the total number of
>> EAs in Uganda or is it the total number of hhs living in Uganda?
>>
>> Thank you very much for your help and time.
>> Rebecca
>>
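```

The two options discussed in the thread can be sketched in a short do-file fragment. This is only an illustration under stated assumptions: `psu_var`, `hhweights_var` and `strata1_var` are the placeholder names used in the messages above, while `mig_type` and its coding (1 = no migrants, 2 = internal, 3 = international) are hypothetical and would need to be replaced by whatever the survey files actually contain.

```stata
* Sketch only: variable names are placeholders from the thread;
* mig_type is a hypothetical household-type variable.

* With-replacement approximation at the first stage:
* no fpc() is given, so a single stage is sufficient.
svyset psu_var [pw=hhweights_var], strata(strata1_var)

* Second-stage stratum indicator, needed only if a second stage is declared.
* Assumed coding of mig_type: 1 = no migrants, 2 = internal, 3 = international.
generate byte strata2_var = mig_type
label define strata2_lbl 1 "no migrants" 2 "internal migrants" 3 "international migrants"
label values strata2_var strata2_lbl

* Inspect the declared design (strata, PSUs, observations per PSU).
svydescribe
```

If an fpc were used after all, note that `svyset`'s `fpc()` option expects a variable holding either the sampling rate or the population count of units at that stage (EAs per stratum at the first stage), not the factor [(N-n)/(N-1)]^(1/2) itself.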
