Re: st: STATA and NHIS weight variables

From	Michael Drazer <[email protected]>
To	[email protected]
Subject	Re: st: STATA and NHIS weight variables
Date	Mon, 20 Jul 2009 13:48:40 -0500

Steve, Stas, Joao, and Nick: thank you for the help.

Stas, your understanding of the design agrees with my own understanding, and the sample adult and sample adult cancer data do have weights (both wtfa and wtfa_sa, with wtfa ~= wtfa_sa) for the individuals who completed the survey for the sample adult and cancer files. Adults not completing the cancer/sample adult surveys only have a wtfa weight.

I merged the three files because I created a new variable (referred to here as variable "D") that was defined using data in each of the three files - so variable A was only located in the person file, whereas variable B was only located in the Sample Adult file, and variable C was only located in the Cancer file. Variable D varies based upon the values of variables A, B, and C.

Re: the observations, some of the observations do overlap. For example, an observation for Adult "A" would be found in all three files (Person, Adult, Cancer) if they were randomly sampled, whereas an observation for Adult "B" would only be found in the Person file if they were surveyed for the Person file, but another adult (Adult "C", a spouse, relative, etc.) in the household was randomly sampled for the adult/cancer surveys. Even so, the merge was successful (I merged by household, family, and person record identifiers) and I have the correct # of total observations.
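A merge of this kind might look roughly like the sketch below. It uses three tiny made-up datasets in place of the real NHIS files; the identifier names (hhx, fmx, px), the variables a, b, and c, and the indicator names m_adult, m_cancer, and in_all3 are all assumptions for illustration, not the actual NHIS layout. The merge commands use the Stata 10 syntax; newer releases would normally use "merge 1:1" instead.

```stata
* Toy stand-ins for the person, sample adult, and cancer files.
* All variable names here are hypothetical.
clear
input hhx fmx px a
1 1 1 1
1 1 2 0
2 1 1 1
end
tempfile person
save `person'

clear
input hhx fmx px b
1 1 1 5
2 1 1 7
end
tempfile adult
save `adult'

clear
input hhx fmx px c
1 1 1 2
2 1 1 3
end
tempfile cancer
save `cancer'

* Stata 10 merge syntax: match on the three identifiers and require
* that they identify records uniquely in both datasets.
use `person', clear
merge hhx fmx px using `adult', unique sort _merge(m_adult)
merge hhx fmx px using `cancer', unique sort _merge(m_cancer)

* A merge code of 3 means the record was found in both master and using data.
generate byte in_all3 = (m_adult == 3) & (m_cancer == 3)
tabulate in_all3
```

Tabulating the merge indicators is a quick way to confirm which persons appear in all three files and which appear only in the person file.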

I suspect that the best way to address this issue, based on the helpful feedback of the Statalisters (thank you for the help, again) would be to redefine my newly created variable "D" for only those persons who completed all three surveys, and then to perform my analyses on that same subpopulation of persons who completed all three surveys.
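As a rough sketch of that redefinition: the rule used to build D below is entirely hypothetical (the actual rule is not given in this thread), and in_all3 is an assumed 0/1 indicator for persons present in all three files.

```stata
* Toy data: a comes from the person file, b and c from the sample adult
* and cancer files; in_all3 flags persons found in all three (assumed names).
clear
input a b c in_all3
1 5 2 1
0 . . 0
1 7 3 1
end

* Start with D missing for everyone, then fill it in only for the
* subpopulation. The condition itself is a made-up placeholder rule.
generate byte d = .
replace d = (a == 1 & b > 5 & c >= 3) if in_all3 == 1

* D should stay missing for anyone outside the subpopulation.
assert missing(d) if in_all3 == 0
tabulate d in_all3, missing
```

Leaving D missing outside the subpopulation avoids Stata silently treating missing values of b or c as very large numbers in comparisons such as b > 5.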

In response to Steve's comment, I'm using Stata 10, so you were correct in identifying that there is no "psu()" option - my command was out-of-date for my purposes (I was using this resource from NHIS in writing the command: http://www.cdc.gov/nchs/data/nhis/9705var.pdf). The new command reads:

svyset psu [pweight=wtfa_sa], strata(stratum) vce(linearized) singleunit(missing)


Please correct me if this does not sound correct.
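For illustration, here is a minimal sketch of how that svyset statement could be combined with a subpopulation analysis. The data are invented; d and in_all3 are assumed names for the derived variable and for a 0/1 indicator of persons present in all three files. Persons who were not sample adults have a missing wtfa_sa, as described above.

```stata
* Invented toy data: two strata, two PSUs per stratum.
clear
input stratum psu wtfa_sa d in_all3
1 1 100 1 1
1 1 120 0 1
1 2   . . 0
1 2  90 1 1
2 1 130 0 1
2 1   . . 0
2 2 105 1 1
2 2  95 0 1
end

* Declare the design with the sample adult weight.
svyset psu [pweight = wtfa_sa], strata(stratum) vce(linearized) singleunit(missing)

* Restrict estimation to the subpopulation with subpop() rather than
* by dropping observations, so the design information stays intact.
svy, subpop(in_all3): mean d
```

Using subpop() instead of an if qualifier or drop is the usual recommendation for survey data, because the variance estimator still needs the full set of strata and PSUs.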

Thanks again for the consideration and help,

Michael


On Jul 20, 2009, at 12:22 PM, Stas Kolenikov wrote:

That's terrible -- they have a 100k input file with 2200+ lines, but
did not bother to put just one more line for the appropriate design
specification!

Here's my understanding of NHIS: the household interview was conducted, and data
were collected for each person (by self or proxy reporting) producing
the person file. The weights in the person file would essentially be
the household weights. An adult was sampled to provide more detailed
information; their weight will be higher than the person weight by the
factor equal to (# of people in the household). All sampled adults
should have reported on the cancer supplement, so the sample adult and
the sample adult cancer data should have the same weights.

What was it that you tried to achieve by merging the three files? My
understanding is that you should've received almost a block-diagonal
structure of your data: the variables from the person data go with the
person(=household) weights, the variables from the sample adult and
cancer data go with the sample adult weights, and the
observations/variables do not overlap. Am I wrong? If you want to mix
information from them in your analysis, then you will have a lot of
missing data from the unsampled adults who did not provide the
information for the sample adult and cancer data sets. If you only do the
analysis using the person data, you should use the wtfa weights from
the person data. If your analysis uses sample adult and cancer data,
you should use wtfa_sa weights. I expect that it will be impossible to
meaningfully analyze the data from all three files simultaneously, but
I will likely be wrong with this.
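The rule of thumb above (wtfa for person-file variables, wtfa_sa for sample adult and cancer variables) could be sketched as follows. The data are invented, and a and b are hypothetical stand-ins for a person-file variable and a sample adult variable.

```stata
* Invented toy data: every person has wtfa, but only the sampled adult
* in each PSU has wtfa_sa and the sample adult variable b.
clear
input stratum psu wtfa wtfa_sa a b
1 1 50 100 1 5
1 1 50   . 0 .
1 2 60 110 1 7
1 2 60   . 1 .
2 1 40  90 0 6
2 1 40   . 1 .
2 2 55 105 1 8
2 2 55   . 0 .
end

* Person-level variable: declare the design with the person weight.
svyset psu [pweight = wtfa], strata(stratum)
svy: mean a

* Sample adult variable: re-declare the design with the sample adult weight.
svyset psu [pweight = wtfa_sa], strata(stratum)
svy: mean b
```

Only one svyset declaration is active at a time, so the design has to be re-declared whenever the analysis switches between the two kinds of variables.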

On Mon, Jul 20, 2009 at 11:07 AM, Michael Drazer <[email protected]> wrote:

Hello,
I am new to STATA and am doing some work with a dataset that I created by merging together the 2005 sample adult, person, and sample adult cancer NHIS datasets. I'm trying to construct an appropriate survey design statement, but am not sure how to properly define the pweights for my dataset. According to the Sample Adult variable layout, the wtfa_sa weight should be used for most sample adult analyses - I noticed, however, that the Sample Adult dataset also has values for the wtfa weight. All of the 98,649 observations in my merged dataset have wtfa weights, but only those 31,428 observations in the sample adult/sample adult cancer datasets contain wtfa_sa weights. My question is: to properly weigh this data, should I use a standard survey design statement, such as:

svyset [pweight=wtfa], strata(stratum) psu(psu)

even though it does not take into account the wtfa_sa weights, or is there a better way to weigh this data?

For additional info re: variable layouts in the NHIS data, see here (the Variable layout PDFs are informative in terms of a description of the wtfa and wtfa_sa variables - see the person layout for wtfa and either the sample adult or sample adult cancer layout for the wtfa_sa variable):
http://www.cdc.gov/nchs/nhis/nhis_2005_data_release.htm

Thanks in advance,

Michael


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.

