Fernando Terrés <fernando.terres@upc.edu>:

Apparently, you cannot cluster by PSU (602 undisclosed municipalities), but
you can by region (17 of those); is there a level of geography between
region and municipality identifiable in the survey? 17 is too small a
number of clusters to get a good VCE, but you can't see the PSU; if there
is something in between, it might give very good results.

On Fri, May 29, 2009 at 9:54 AM, Michael I. Lichter <MLichter@buffalo.edu> wrote:
> The study description suggests that this is a complex probability sample,
> but by failing to provide you with identifiers for the strata, PSUs, and
> secondary sampling units, the original researchers made it impossible for
> you to estimate the effects of stratification and clustering. This is a
> problem, whether the sample was a true probability sample or not; your
> standard errors will almost certainly be too small regardless of how you
> calculate them.
>
> The "sampling weights" appear to be poststratification weights based on
> external (census or other) estimates of true population values, rather than
> design-based probability weights. You can treat them as either pweights in
> "regular" Stata commands or as poststratification weights in -svy- commands
> and I think you will get the same answers either way, although if you use
> them as poststratification weights, you have to be more careful about
> subsetting.
>
> In any event, Ana is right; the failure of the researchers to give you
> enough information about the design and the weights is not a rationale for
> ignoring the weights, especially for simple tabulations.
>
> Michael
>
> Ana Gabriela Guerrero Serdan wrote:
>>
>> The survey you describe is complex, but that doesn't mean that it is not
>> random. It's just that, to save costs or to be sure that they include
>> specific groups/workers, they have used stratification and clustering.
>> You probably need to use svy commands in Stata. But this depends on what
>> you are interested in estimating; for population totals and descriptives
>> you certainly would need them.
>> SPSS version 12 has a complex samples option, so you would be able to get
>> this in SPSS as well.
>> See svy commands in Stata.
>> Take a look at Cameron and Trivedi, Microeconometrics, chapter on
>> stratified and cluster samples.
>> rgds, Gaby
>>
>> --- On Fri, 5/29/09, Fernando Terrés <fernando.terres@upc.edu> wrote:
>>
>>> From: Fernando Terrés <fernando.terres@upc.edu>
>>> Subject: st: Complex survey with only sampling weights
>>> To: statalist@hsphsun2.harvard.edu
>>> Date: Friday, May 29, 2009, 5:50 AM
>>>
>>> I need to analyze an official survey, with data on 11,054 workers,
>>> where the sampling design is, according to the survey company:
>>> 'multistage, stratified by clusters, with random selection of both PSU
>>> (602 undisclosed municipalities), and secondary sampling units
>>> (undisclosed census sections), and the last sample units (workers) are
>>> selected by random routes and quotas'. They provide sampling weights
>>> with 1681 unique values, defined by the combination of gender (2),
>>> region (17), firm size (6), and economic activity (13).
>>> My question is very simple: is this a probabilistic sampling design?
>>> I suspect that it is not, but I have some doubts because the
>>> documentation disclosed by the official bureau that commissioned the
>>> survey clearly insists on using the weights (they present a Word
>>> document tabulating them), which are the only sampling information
>>> included in the SPSS files that they provide (this reinforces my
>>> doubts, because I'm using Stata 10, which correctly uses the sampling
>>> weights, while to my knowledge SPSS only uses frequency weights).
>>> Thank you in advance,
>>> Fernando.
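For readers following the thread, the options discussed above can be sketched in Stata roughly as follows. This is only an illustration under stated assumptions, not the posters' own code: the variable names wt (the supplied weight), region, wage, and female are hypothetical, and the poststratification route Michael mentions is left out because it would also need the population totals for each poststratum, which the thread does not provide.

```stata
* Hypothetical variable names: wt (supplied weight), region, wage, female.

* 1. Weights only: each worker is treated as its own PSU (no clustering).
svyset _n [pweight = wt]
svy: mean wage
svy: tabulate female

* 2. Weights plus clustering on region, the only geography disclosed.
*    With just 17 clusters the variance estimate may be unreliable.
svyset region [pweight = wt]
svy: mean wage

* 3. The same weights as pweights in a "regular" estimation command.
regress wage female [pweight = wt], vce(cluster region)
```

Point estimates should be the same under settings 1 and 2, since both use the same weights; only the standard errors change with the choice of cluster variable.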

