



Re: st: Two different sampling strategies within the same survey


From   Steven Samuels <sjsamuels@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Two different sampling strategies within the same survey
Date   Wed, 1 Dec 2010 13:50:13 -0500

Etan-

The -svyset- statement is not correct, nor is the count of stages (subsampling of women within a household adds a stage). Based on the information you've provided, I suggest the following:

Let urban_rural be the variable that designates urban or rural. Define a new variable "psu". In the rural areas, psu = village ID. In the urban areas it is electoral ward. Then use the following -svyset- statement.

svyset psu [pweight= final person weight], strata(urban_rural)
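As a minimal sketch of the steps above: the variable names village_id, ward_id, and person_wt are hypothetical, the IDs are assumed to be numeric, and urban_rural is assumed to be coded 1 for urban and 0 for rural. Substitute the names and codes in your own file.

```stata
* Hypothetical names: village_id, ward_id, person_wt (final person weight).
* Assumed coding: urban_rural == 0 is rural, urban_rural == 1 is urban.

* Rural records: the PSU is the village.
generate long psu = village_id if urban_rural == 0

* Urban records: the PSU is the electoral ward.
replace psu = ward_id if urban_rural == 1

* Declare the design: one stage, stratified by urban/rural.
svyset psu [pweight = person_wt], strata(urban_rural)

* Check the number of strata and PSUs that Stata sees.
svydescribe
```

Stata identifies PSUs within strata, so a village and a ward that happen to share an ID number are still treated as different PSUs.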

Some thoughts:

Why only the one stage of sampling? Only one census tract was selected in each ward, so variance due to sampling of tracts is incorporated into the between-ward (psu) component of variance. Adding the HH and person sampling stages to -svyset- would not change standard errors by much (not at all, for some designs) and would not be accurate, because Stata assumes that there is simple random sampling at later stages. In fact, many published data files from multi-stage samples use a similar -svyset- command, because they identify only the strata and primary sampling units.

For analyzing individual responses, use the person weight, not the household weight. The two are not the same, because women were sampled within households.

If one purpose of your study is to estimate descriptive statistics such as means or proportions, and if the fractions of villages and wards selected were large, then include the fpc option in your -svyset- statement. When you do hypothesis testing and modeling, omit the fpc in a second -svyset- statement.
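A sketch of the two declarations might look like this. The names person_wt, n_psu_pop, outcome, and age are hypothetical. n_psu_pop is assumed to hold, for every record, the total number of villages (rural) or wards (urban) in the population of that record's stratum; -svyset- also accepts sampling rates in fpc().

```stata
* Hypothetical names: person_wt, n_psu_pop, outcome, age.

* Descriptive estimates: declare the design with a finite population correction.
svyset psu [pweight = person_wt], strata(urban_rural) fpc(n_psu_pop)
svy: mean outcome

* Hypothesis tests and models: declare the design again, without the fpc.
svyset psu [pweight = person_wt], strata(urban_rural)
svy: regress outcome age
```

Each -svyset- replaces the previous settings, so the second declaration is the one in force for the regression.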

You give no details about how villages or electoral wards were selected. If any were selected with certainty (not sampled, but chosen ahead of time), designate these as "certainty" units in the -svyset- command.
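One possible way to do this, offered as a sketch and not the only approach, is to put each certainty PSU in its own stratum and use the singleunit(certainty) option. The indicator certainty_psu (1 if the village or ward was taken with certainty) and the weight person_wt are hypothetical names.

```stata
* Hypothetical names: certainty_psu (0/1 indicator), person_wt.

* Start from the urban/rural strata, numbered 1, 2.
egen long stratum = group(urban_rural)

* Number the certainty PSUs 1, 2, ... (missing for sampled PSUs).
egen long cert_id = group(urban_rural psu) if certainty_psu == 1

* Give each certainty PSU a stratum code of its own, above the existing codes.
summarize stratum, meanonly
replace stratum = r(max) + cert_id if certainty_psu == 1

* Single-PSU strata are treated as certainty units.
svyset psu [pweight = person_wt], strata(stratum) singleunit(certainty)
```

With this declaration the certainty strata contribute nothing to the variance estimate. If within-unit sampling variability in those units matters, an alternative is to treat the next-stage units inside each certainty unit as the PSUs.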

Use correct terminology. I am guessing that when you say "systematic selection", you mean "systematic sampling". The term "selected" for PSUs and census blocks conveys no information about the sampling procedure.

If you are uncertain about how to proceed, I suggest that you read the study documentation carefully and, if necessary, contact the study statistician. I also suggest you read a good sampling text, such as Sharon Lohr's Sampling: Design and Analysis. If you are familiar with survey concepts, the Stata 11 Manual entry for -svyset- is helpful for explaining the options in detail.

Steve

Steven J. Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 845-246-0774
Fax: 206-202-4783






On Nov 30, 2010, at 9:37 PM, Etan Lakam wrote:

Dear Listers,

I am grappling with how to set the sampling design in Stata using the
svyset command. My problem is that within the same survey a two-stage design
was used in rural areas and a three-stage design in urban areas.

For the two-stage design: villages were selected and then there was a systematic selection of households. For the three-stage design: electoral wards were selected, then a census block was taken from each ward, then households were selected, and then women were systematically selected within
each household.

I have two queries:
1- Would the following command for setting the sampling design in the
urban area be correct?
*svyset electoral || census || household [pweight=house]*
where electoral is the variable for electoral ward, census is the variable for census block and household is the variable for household, house is the
household weight.

2- How do I account for the two sampling strategies as I would like to
analyse the whole data: urban+rural? In other words, how do I set the
overall sampling design using svyset?

Thank you in advance

Etan

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/


