st: Re: Using STATA to analyze complex national datasets
In the Sage book on sampling by Graham Kalton, he states:
"Under the with-replacement assumption a single standard error formula for a
particular estimator applies, no matter what form of subsampling is used
within the PSUs. Thus for instance, the same formula applies whether the
elements are sampled (1) by SRS within the selected PSUs, (2) by systematic
or stratified sampling, or (3) with further sampling stages and
stratification. This generality is appealing...because the user of the
program is not required to supply the program with details about the
subsample design. The use of these programs requires only that each survey
data record contains a code to indicate to which PSU it belongs, together
with information about the first-stage stratification."(p78)
Do we disagree?
Doesn't this cover many of the common designs for large national surveys?
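For concreteness, the with-replacement formula Kalton describes can be sketched in a few lines. This is only an illustration in Python (not Stata, and not code from the book); the function name wr_variance_of_total and the toy records are made up. It estimates a weighted total and its variance using nothing but the first-stage stratum code, the PSU code, and the weight on each record:

```python
from collections import defaultdict


def wr_variance_of_total(records):
    """Estimate a weighted total and its with-replacement variance.

    records: iterable of (stratum, psu, weight, y) tuples, one per
    survey record. Nothing about sampling within PSUs is needed.
    """
    # Step 1: collapse every record to a weighted total for its PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, weight, y in records:
        psu_totals[stratum][psu] += weight * y

    total = 0.0
    variance = 0.0
    for stratum, psus in psu_totals.items():
        z = list(psus.values())
        n = len(z)
        if n < 2:
            # A single-PSU stratum has no between-PSU variation to measure.
            raise ValueError(f"stratum {stratum!r} has fewer than 2 PSUs")
        mean = sum(z) / n
        total += sum(z)
        # Step 2: between-PSU variation within each first-stage stratum.
        variance += n / (n - 1) * sum((v - mean) ** 2 for v in z)
    return total, variance


# Hypothetical toy data: (stratum, psu, weight, y)
toy_records = [
    ("A", 1, 2.0, 1.0), ("A", 1, 2.0, 3.0),
    ("A", 2, 2.0, 5.0),
    ("B", 3, 1.0, 4.0),
    ("B", 4, 1.0, 6.0), ("B", 4, 1.0, 2.0),
]
total, variance = wr_variance_of_total(toy_records)
print(total, variance)  # 30.0 20.0
```

Note that however the records inside a PSU were drawn, they enter the calculation only through the PSU's weighted total, which is the generality the quotation refers to.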
To paraphrase Caleb Southworth's question: what are some examples of
when a multi-stage design cannot be reduced to one stage using this
----- Original Message -----
From: "Caleb Southworth" <email@example.com>
Sent: Tuesday, February 04, 2003 10:35 PM
Subject: st: RE: Using STATA to analyze complex national datasets
> On Mon, 3 Feb 2003, Sayer, Bryan wrote:
> :documentation for them. Basically, the answer is no. Stata does only one
> :level of sample design. So unless you can reduce a more complex sample
> :design down to one level, it is not possible in Stata. One issue in
> :simplifying the sample design is that you can get increased variability in
> :the variance. So it isn't as simple as just using the highest level.
> :Perhaps if enough people lean on NCHS, they might come up with something.
> I think Bryan does an excellent job of raising the question: When can a
> two-stage design be reduced to one-level? A cursory search of the web
> shows lots of users collapsing two-stage designs into clusters and strata,
> My point here is not to single out a particular course webpage, but rather
> to highlight what appears to be a general problem.
> I don't know the NCHS data to which Joe refers, but I see this sort of
> problem all the time in the Russian Longitudinal Monitoring Survey (RLMS):
> analysts either ignore one level of clustering or treat a cluster as a
> stratum. RLMS has a two-stage design in which it first selects geographic
> regions and then selects households. All adult members of the household
> are interviewed. So the question is: what is the implication of analyzing
> data from a two-stage cluster sample as cluster and strata? Or, in STATA:
> svyset strata region
> svyset psu household
> svyset pweight indwgt
> Another way to ask this question might be: Do strata have to be nested
> within clusters? Regions and households are both clusters, i.e. they are
> both "sampling unit[s] with which one or more listing units can be
> associated" (Levy and Lemeshow 1999, p. 266). Likewise, region would also
> seem to be a stratum, as in one of L mutually exclusive and exhaustive
> groups from which a simple random sample is drawn (Ibid., p. 121).
> Is this a reasonable way to collapse a two-stage design into one level for
> analysis with STATA's survey estimators?
> If the nested nature of the design is crucial, perhaps that could be
> addressed with HLM where we have two levels and clustering by households?
> gllamm [individual level variable] , i(region) cluster(household)
> This has the advantage of being able to specify weights at both levels and
> have a list of variables that define clusters. Comments?
> Dr. Caleb Southworth, Ph.D
> American Council of Learned Societies Research Fellow 2002-03
> Assistant Professor
> Department of Sociology
> 1291 University of Oregon
> Eugene OR 97403
> Work: (541) 346-5034
> Fax: (541) 346-5026
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/