Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Two different sampling strategies within the same survey

 From Etan Lakam <[email protected]> To [email protected] Subject Re: st: Two different sampling strategies within the same survey Date Mon, 3 Jan 2011 09:16:25 -0500

```Dear Steve,
Thank you for this very insightful answer.
Sorry for the late reply; it was due to ill health.
Etan

On Wed, Dec 1, 2010 at 1:50 PM, Steven Samuels <[email protected]> wrote:
> Etan-
>
> The -svyset- statement is not correct, nor is the count of stages
> (sub-sampling of women within a household adds a stage). Based on the
> information you've provided, I suggest the following:
>
> Let urban_rural be the variable that designates urban or rural. Define a new
> variable "psu". In the rural areas, psu = village ID. In the urban areas it
> is electoral ward. Then use the following -svyset- statement.
>
> svyset psu [pweight= final person weight], strata(urban_rural)
>
> Some thoughts:
>
> Why only the one stage of sampling?  Only one census tract was selected in
> each ward, so variance due to sampling of tracts is incorporated into the
> between-ward (psu) component of variance. Adding the HH and person sampling
> stages to -svyset- would not change standard errors by much (not at all, for
> some designs) and would not be accurate, because Stata assumes that there is
> simple random sampling at later stages.  In fact, many published data files
> from multi-stage samples take a similar -svyset- command, because they
> identify only the strata and primary sampling units.
>
> For analyzing individual responses, use the person weight, not the household
> weight. The two are not the same, because women were sampled within
> households.
>
> If one purpose of your study is to estimate descriptive statistics such as
> means or proportions and if the fractions of villages and wards selected
> were large, then include the fpc option in your -svyset- statement. When you
> do hypothesis testing and modeling, then omit the fpc in a second -svyset-
> statement.
>
> You give no details about how villages or electoral wards were selected. If
> any were selected with certainty (not sampled, but chosen ahead of time),
> designate these as "certainty" units in the -svyset- command.
>
> Use correct terminology.  I am guessing that when you say
> "systematic selection", you mean "systematic sampling".  The term
>  "selected" for PSUs and census blocks conveys no information about the
> sampling procedure.
>
> If you are uncertain about how to proceed, I suggest that you read the study
> documentation carefully and, if necessary, contact the study statistician. I
> also suggest you read a good sampling text, such as Sharon Lohr's Sampling:
> Design and Analysis.  If you are familiar with survey concepts, the Stata 11
> Manual entry for -svyset- is helpful for explaining the options in detail.
>
> Steve
>
> Steven J. Samuels
> [email protected]
> 18 Cantine's Island
> Saugerties NY 12477
> USA
> Voice: 845-246-0774
> Fax: 206-202-4783
>
>
> On Nov 30, 2010, at 9:37 PM, Etan Lakam wrote:
>
> Dear Listers,
>
> I am grappling with how to set the sampling design in Stata using the
> svyset command. My problem is that within the same survey a two-stage design
> was used in rural areas and a three-stage design in urban areas.
>
> For the two-stage design: villages were selected and then there was a
> systematic selection of households. For the three-stage design: electoral
> wards were selected, then a census block was taken from each ward and then
> households were selected and then women were systematically selected within
> each household.
>
> I have two queries:
> 1- Would the following command for setting the sampling design in the
> urban area be correct?
> *svyset electoral || census || household [pweight=house]*
> where electoral is the variable for electoral ward, census is the variable
> for census block and household is the variable for household, house is the
> household weight.
>
> 2- How do I account for the two sampling strategies as I would like to
> analyse the whole data: urban+rural? In other words, how do I set the
> overall sampling design using svyset?
>
>
> Etan
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>

```
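For readers who want to try Steve's suggestion, here is a minimal Stata sketch of the combined urban and rural declaration. It is only an illustration under stated assumptions, not the study's actual setup: the variable names `village_id`, `ward_id`, `person_wt`, `n_psu_pop` and `outcome` are hypothetical, both ID variables are assumed to be numeric, and `urban_rural` is assumed to be coded 0 = rural, 1 = urban.

```stata
* Hypothetical variable names: village_id, ward_id, person_wt, n_psu_pop, outcome.
* Assumption: urban_rural is coded 0 = rural, 1 = urban, and the IDs are numeric.

* Build one PSU identifier: village in rural areas, electoral ward in urban areas.
generate long raw_psu = cond(urban_rural == 1, ward_id, village_id)

* Renumber within area type so a village and a ward can never share a PSU code.
egen long psu = group(urban_rural raw_psu)

* Descriptive statistics: one-stage declaration with a finite population correction.
* n_psu_pop (assumed) holds the number of PSUs in the population of each stratum.
svyset psu [pweight = person_wt], strata(urban_rural) fpc(n_psu_pop)
svydescribe
svy: mean outcome

* Hypothesis tests and models: declare the design again without the fpc.
* A new svyset replaces the earlier settings.
svyset psu [pweight = person_wt], strata(urban_rural)
svy: regress outcome i.urban_rural
```

The `fpc()` line only makes sense if the population count of villages and wards is known for each stratum; if it is not, the second `svyset` statement alone matches the command given in the reply. Certainty PSUs, if there are any, would need separate handling that is not shown here.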