


Re: st: specifying SVYSET in household survey using multi-stage clustered sampling

From   Steve Samuels <[email protected]>
To   [email protected]
Subject   Re: st: specifying SVYSET in household survey using multi-stage clustered sampling
Date   Fri, 1 Oct 2010 10:19:39 -0400


I found your description confusing. I want to reconstruct the survey
design in terms that I can understand, so I'll start with the basics.
Here's what I think you have done. Please correct me if I am wrong.

1) Your survey area is divided into regions

2) Every region had at least one camp.  You selected all camps into
the study and took a sample of HH from each.

3) In all regions, refugees could also live in "gatherings" outside
camps.   You selected a _sample_ of these gatherings in each region.
Within each selected gathering, you took a sample of HH.

Question: did you also study refugees who lived neither in camps nor in gatherings?

Question: within HH, did you obtain aggregate information, or
information about each member?

You have stated that one purpose of the study is to obtain estimates for
each region. Are these primarily estimates of descriptive statistics
(e.g., proportions)?
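
If the reconstruction in points 1) to 3) is right, one possible way to
declare the design is sketched below. This is only a sketch under that
assumption, and the answers to the questions above may change it. The
variable names region, camp, cluster and hh are hypothetical; only
"weights" comes from your post. The idea is that camps taken with
certainty act as strata rather than as sampling units, so households
are the first-stage units inside camps, while gatherings are the
first-stage units inside each region.

```stata
* Sketch only. Hypothetical variables:
*   region  - region identifier
*   camp    - 1 if the household is in a camp, 0 if in a gathering
*   cluster - numeric camp or gathering identifier
*   hh      - numeric household identifier
* "weights" is the pweight variable named in the original post.

* Each camp is its own stratum; gatherings are stratified by region.
egen stratum = group(region cluster) if camp == 1
egen gstrat  = group(region)         if camp == 0
quietly summarize stratum
replace stratum = r(max) + gstrat if camp == 0

* First-stage unit: household within a camp, gathering otherwise.
generate double psu = cond(camp == 1, hh, cluster)

svyset psu [pweight = weights], strata(stratum)
svydescribe
```

-svydescribe- then lists the number of first-stage units in each
stratum, which makes it easy to spot any stratum that still holds a
single unit.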


Steven J. Samuels
[email protected]
18 Cantine's Island
Saugerties NY 12477
Voice: 845-246-0774
Fax:    206-202-4783

On Fri, Oct 1, 2010 at 2:22 AM, Karin Seyfert <[email protected]> wrote:
> Dear stata List,
> we have run a large household survey among refugees.
> Refugees live in clusters of camps or outside camp gatherings within
> several regions.
> We stratified our sample by 'camp' vs. 'outside camp gatherings' (1)
> and region (2).
> In strata (1) we under- and oversampled households to obtain robust
> regional estimates.
> Within strata (2), the camp/outside camp strata, we sampled households
> proportional to the share of households living inside or outside
> camps.
> We selected clusters within these two strata as follows:
> a) We selected all camps in all regions and
> b) a certain number of gatherings in all regions. Gatherings were
> selected with probabilities proportionate to their population within
> each region. They were sampled without replacement.
> Within the selected clusters, we used simple random sampling to select
> refugee households.  Within each cluster we sampled about 5-10% of the
> population. Since we are unsure about exact camp/gathering populations
> and we sample a small share, we assume sampling with replacement.
> I do have sampling weights (inverse probability of a HH being
> selected) and have adjusted for over- and under-sampling within the
> regional strata (variable called 'weights'). Some strata contain a
> singleton SU (one region has only one camp), which we treat as
> certainty units.
> I am unsure how to specify -svyset-. Below is what I think the output
> of -svydes- should look like. Does it look correct? I would be
> grateful for help with the question marks below. I am also unsure what
> to specify as the PSU: households or clusters?
> pweight:      weights
> VCE:          linearized
> Single unit:  certainty
> Strata 1:     camp/gathering
> SU 1:         ?
> FPC 1:        ?
> Strata 2:     regions
> SU 2:         households
> FPC 2:        number of households per region
> I am sorry to take your time. I would really appreciate your help!
> Please also correct any mistakes or inconsistencies in my reasoning.
> Many Thanks
> Karin Seyfert
> PhD Candidate
> School of Oriental and African Studies
> University of London
> --
> Karin
> +961 71843862