Re: st: Complex survey with only sampling weights

From   Austin Nichols
Subject   Re: st: Complex survey with only sampling weights
Date   Fri, 29 May 2009 10:44:18 -0400

Fernando Terrés:
Apparently, you cannot cluster by PSU (602 undisclosed
municipalities), but you can by region (17 of those); is there a level
of geography between region and municipality identifiable in the
survey?  17 is too small a number of clusters to get a good VCE, but
you can't see the PSU; if there is something in between, it might give
very good results.
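
A minimal sketch of clustering on region with the supplied weights,
assuming hypothetical variable names wt (the weight), region, wage, and
female, none of which are taken from the survey files:

```stata
* Hypothetical variable names (assumptions, not from the survey files):
*   wt     = supplied sampling weight
*   region = region identifier (17 values)
*   wage, female = example analysis variables

* Declare region as the cluster and the supplied weights as pweights.
svyset region [pweight = wt]

* Weighted mean with a linearized VCE clustered on region.
svy: mean wage

* A weighted regression without -svy-, with a cluster-robust VCE on region.
regress wage female [pweight = wt], vce(cluster region)
```

With only 17 clusters these standard errors should be read with caution;
if an identifier between region and municipality exists, it could replace
region in both commands.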

On Fri, May 29, 2009 at 9:54 AM, Michael I. Lichter wrote:
> The study description suggests that this is a complex probability sample,
> but by failing to provide you with identifiers for the strata, PSUs, and
> secondary sampling units, the original researchers made it impossible for
> you to estimate the effects of stratification and clustering. This is a
> problem, whether the sample was a true probability sample or not; your
> standard errors will almost certainly be too small regardless of how you
> calculate them.
> The "sampling weights" appear to be poststratification weights based on
> external (census or other) estimates of true population values, rather than
> design-based probability weights. You can treat them as either pweights in
> "regular" Stata commands or as poststratification weights in -svy- commands
> and I think you will get the same answers either way, although if you use
> them as poststratification weights, you have to be more careful about
> subsetting.
> In any event, Ana is right; the failure of the researchers to give you
> enough information about the design and the weights is not a rationale for
> ignoring the weights, especially for simple tabulations.
> Michael
> Ana Gabriela Guerrero Serdan wrote:
>> The survey you describe is complex, but that doesn't mean it is not random.
>> It's just that, to save costs or to be sure that they include specific
>> groups/workers, they have used stratification and clustering.
>> You probably need to use svy commands in Stata. But this depends on what
>> you are interested in estimating; for population totals and descriptives you
>> certainly would need them.
>> SPSS version 12 has a Complex Samples option, so you would be able to get
>> this also in SPSS.
>> See svy commands in Stata.
>> Take a look at Cameron and Trivedi, Microeconometrics, chapter on
>> stratified and cluster samples.
>> rgds, Gaby
>> --- On Fri, 5/29/09, Fernando Terrés wrote:
>>> From: Fernando Terrés
>>> Subject: st: Complex survey with only sampling weights
>>> Date: Friday, May 29, 2009, 5:50 AM
>>> I need to analyze an official survey with data on 11,054 workers, where
>>> the sampling design is, according to the survey company:
>>> 'multistage, stratified by clusters, with random selection of both PSU
>>> (602 undisclosed municipalities), and secondary sampling units
>>> (undisclosed census sections), and the last sample units (workers) are
>>> selected by random routes and quotas'. They provide sampling weights
>>> that are (1681) unique values for each combination of gender (2), region
>>> (17), firm size (6), and economic activity (13).
>>> My question is very simple: is this a probabilistic
>>> sampling design? I suspect that it is not, but I have some doubts because
>>> the documentation disclosed by the official bureau that commissioned the
>>> survey clearly insists on using the weights (they present a Word document
>>> tabulating them), which are the only sampling information included in
>>> the SPSS files that they provide (this reinforces my doubts, because I'm
>>> using Stata 10, which correctly uses the sampling weights, while to my
>>> knowledge SPSS only uses frequency weights).
>>> Thank you in advance,
>>> Fernando.
