Re: st: how to svyset for stratified multiple-stage cluster sampling in STATA

From	[email protected] (Jeff Pitblado, StataCorp LP)
To	[email protected]
Subject	Re: st: how to svyset for stratified multiple-stage cluster sampling in STATA
Date	Wed, 19 Apr 2006 17:52:46 -0500

Jian Zhang <[email protected]> has a -svyset- question:

> The sample was obtained as follows. I sampled the population by 
> stratifying it first, and then I randomly selected several clusters for 
> each stratum. Within each cluster, I then randomly selected several 
> subclusters, and then for each subcluster, I randomly selected a certain 
> number of observations.  For this sampling plan, how do I set up the 
> design using the command svyset in STATA? 
> Would it be: 
> svyset [pweight = pwt], fpc(fpc) psu(cluster) strata(strata)?

I'll assume Stata 9, since this is the first release where -svyset- has a
syntax to deal with multiple stages of clustered sampling.

Let's make up some variable names to represent survey design characteristics:

pwt	- sampling weights

strata1	- stage 1 strata
su1	- stage 1 sampling units (PSU)
fpc1	- stage 1 finite population correction

strata2	- stage 2 strata
su2	- stage 2 sampling units (SSU)
fpc2	- stage 2 finite population correction

... you get the idea

Given Jian's description above, the -svyset- command should be as follows:

	svyset su1 [pw=pwt], strata(strata1) fpc(fpc1)		///
		|| su2, fpc(fpc2) || _n, fpc(fpc3)

(note: '///' tells Stata to continue to the next line in ado/do files.)
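
As a follow-up sketch (not part of Jian's question): once the design has been
declared, typing -svyset- by itself redisplays the current settings, and
estimation commands take the -svy:- prefix.  Here 'y' is a made-up analysis
variable, used only for illustration:

	svyset
	svy: mean y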

> I know this is for stratified TWO-stage cluster sampling plan, which is " 
> sample the population by stratifying it first, and then randomly select 
> several clusters for each stratum. Within each cluster, then randomly 
> select a certain number of observations."
>
> Would the svyset for multiple-stage cluster sample (more than  2 stages)
> with stratification be same as TWO-stage cluster sampling with 
> stratification?

Actually, Jian's original -svyset- command:

	> svyset [pweight = pwt], fpc(fpc) psu(cluster) strata(strata)

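should not be used with a two-stage design because an -fpc()- was specified
but nothing was mentioned about the second stage.  With a first-stage -fpc()-
and no later stages declared, the variance estimates would ignore the
variability due to sampling within PSUs.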

Prior to Stata 9, -svyset- only allowed you to specify the first stage design
variables and we recommended that you omit the -fpc()- if the design involved
sampling within PSUs.  In Stata 9 you can specify the design variables for
each stage provided you have them, using '||' to delimit between the stages.
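
For the two-stage design Jian describes (strata, then clusters, then
observations within clusters), a sketch using the made-up variable names from
above, and assuming Jian has an -fpc()- variable for each stage, would be:

	svyset su1 [pw=pwt], strata(strata1) fpc(fpc1) || _n, fpc(fpc2)

If the second-stage -fpc()- variable is not available, the earlier advice of
omitting -fpc()- altogether would seem to still apply:

	svyset su1 [pw=pwt], strata(strata1)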

> More complicated is that what if I do cluster sampling first, and then 
> stratify each cluster, and then do cluster sampling again, what would  
> the command svy for setting up this sampling plan be? 

In this case Jian stratified in the second stage, so Jian should have a
variable like 'strata2' instead of 'strata1':

	svyset su1 [pw=pwt], fpc(fpc1)				///
		|| su2, strata(strata2) fpc(fpc2) || _n, fpc(fpc3)

> Similarly, if I stratify the population first, and then do the cluster, 
> and then do stratification again and then do cluster sampling again, what 
> would the svyset command be for this sampling plan?

	svyset su1 [pw=pwt], strata(strata1) fpc(fpc1)		///
		|| su2, strata(strata2) fpc(fpc2) || _n, fpc(fpc3)

> To generalize the question, if we change the order of cluster sampling 
> and stratification sampling when sampling the population, would the 
> svyset command be different? 

Yes.

In Stata 9, you need to know which sampling stage each strata variable
belongs to, because -strata()- is specified within that stage's part of the
-svyset- command.  See -[SVY] svyset- for more examples of how to -svyset-
multi-stage designs.

Prior to Stata 9, you would only use the -strata()- option if your design had
stratification in the first stage.

--Jeff
[email protected]