
From: "Copeland, Laurel" <Laurel.Copeland@med.va.gov>
To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject: st: NHSDA data, accounting for the sampling design
Date: Tue, 26 Aug 2003 10:01:20 -0700

I am working with the 1997 NHSDA dataset (funded by SAMHDA) from UM's ICPSR archives (choose the 1997 link from http://www.icpsr.umich.edu:8080/ICPSR-SERIES/00064.xml ; codebook available as PDF from http://www.icpsr.umich.edu/cgi/archive.prl?study=2755&path=SAMHDA ). The survey dataset is produced by RTI, which uses its SUDAAN software to analyze it, taking into account the complex sampling design. Briefly, the design consisted of first-stage determination of 43 certainty PSUs stratified into 5 race-related strata, plus a large noncertainty stratum for the remainder of the US from which noncertainty PSUs were selected; second-stage segment sampling within PSUs; and third-stage unit listing. Each respondent ends up with a sampling weight (ANALWT) and two variables representing the nested strata (VESTR and VEREP).

An example in the codebook shows this SUDAAN code, where the first 3 lines seem to specify the design:

    PROC DESCRIPT DATA = "D:\NHSDA97" FILETYPE=SAS DESIGN=WR;
    NEST VESTR VEREP;
    WEIGHT ANALWT;
    VAR MRJFLAG;
    SUBGROUP CATAGE SEX RACE;
    LEVELS 4 2 4;
    TABLES CATAGE*(SEX RACE);
    SETENV DECWIDTH=6 COLWIDTH=17;
    PRINT NSUM WSUM MEAN SEMEAN SETOTAL/
        NSUMFMT=F8.0 WSUMFMT=F12.0 MEANFMT=F15.10 SEMEANFMT=F15.10 SETOTALFMT=12.0
    OUTPUT NSUM WSUM MEAN SEMEAN SETOTAL/
        NSUMFMT=F8.0 WSUMFMT=F12.0 MEANFMT=F15.10 SEMEANFMT=F15.10 SETOTALFMT=F12.0;

There is an accompanying description as follows:

"For use with software such as SUDAAN, two variables were created: VESTR and VEREP. The sampling design used to select the NHSDA results in a deeply stratified sample. Therefore, adjacent strata are collapsed into pairs to create pseudo-strata (VESTR) with two replicates each (VEREP). For all noncertainty strata, the PSU's (each of which represents an implicit stratum) are grouped into pairs based on their sequential order of selection. Each pair of PSUs defines a pseudo-stratum (VESTR) with two replicates (VEREP).
For the certainty portion of the sample, segments represent the first stage of sampling. Each explicit design stratum is partitioned into groups of approximately 24 segments based on order of selection (e.g., about the size of a non-certainty pseudo-stratum). These sets of approximately 24 segments define pseudo-strata (VESTR) for analysis purposes. The segments are then paired in selection order within each certainty pseudo-stratum. One segment from each pair is randomly assigned to replicate 1 and the other segment to replicate 2 (VEREP)."

I have never used SUDAAN (nor do I have access to it), I have not needed the few SAS SURVEY procedures available in regular SAS, and I am generally unfamiliar with setting up Stata (-svyset-) to handle this design. I found some publications on the web that analyzed the NHSDA data with Stata, so I was encouraged by that. I got some information from a SAMHDA research associate at ICPSR, but that person was uncertain how to set up Stata for this dataset.

I have found that if I use -svyset- and -svymean- in Stata:

    svyset pweight ANALWT
    svyset strata VESTR
    svyset psu VEREP
    svymean IRAGE

I can replicate the output I get from running SAS SURVEYMEANS:

    PROC SURVEYMEANS DATA = NHS2.NHS97;
    CLUSTER VEREP;
    STRATA VESTR;
    WEIGHT ANALWT;
    VAR irage;
    WHERE MDESFS3>. AND snufever>.; *to match my subsetted ds in Stata;
    RUN;

Based on this, the SAMHDA research associate wrote:

"Laurel these specifications appear to be correct; I think it's safe to assume your stata settings are solid. According to my documents, Sas commands map to the following stata commands:

    Stratum == svyset strata
    Cluster == svyset psu
    weight  == svyset pweight"

Can anyone confirm this or offer me a more certain translation of the NHSDA design into Stata parameters?
Thank you,
Laurel

Laurel A Copeland, PhD
VA Ann Arbor Health System
(734) 769-7100 x6206
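
For readers on later Stata releases: the three -svyset- calls in the message above use the older one-setting-per-call syntax. What follows is a minimal sketch of the same design declaration in the single-line form, assuming Stata 9 or later and the variable names from the 1997 NHSDA file (ANALWT, VESTR, VEREP, IRAGE). It only restates the mapping quoted in the message (strata = VESTR, PSU = VEREP, pweight = ANALWT); it does not establish that this is the definitive specification for the NHSDA design.

```stata
* Sketch only: assumes Stata 9 or later and that the 1997 NHSDA file is
* already in memory with the variables ANALWT, VESTR, VEREP and IRAGE.

* Declare the design in one call:
*   VEREP  = replicate within pseudo-stratum, treated as the PSU
*   VESTR  = pseudo-stratum
*   ANALWT = sampling weight
svyset VEREP [pweight = ANALWT], strata(VESTR)

* Review what Stata recorded (strata, and PSUs per stratum);
* with this design each stratum would be expected to show two PSUs.
svydescribe

* Design-based mean and standard error of IRAGE.
svy: mean IRAGE
```

Declaring the design once with -svyset- means every later -svy:- command picks up the same weights, strata, and PSUs, which keeps the comparison against the SAS SURVEYMEANS output on a like-for-like footing.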

Follow-Ups:
    st: RE: NHSDA data, accounting for the sampling design
    From: "Nick Cox" <n.j.cox@durham.ac.uk>

