RE: st: Using weights from the Current Population survey
A better option than using iweight or aweight with -tabulate- is to use
-svytab- with pweights. It has several advantages, one of which is that it
gives you asymptotic 95% CIs for proportions that will not cross 0 or 1
(which is good because the share of people giving a survey response can't be
less than 0% or more than 100%).
And to head off the next question, -svytab- won't allow you to specify only
one variable in the command statement. If you want to tabulate the responses
to a single question you have to create a constant (e.g., gen dum=1) and use
it as your second variable, as in:
svytab q1 dum, [options - lots of them]
Also, make sure you use the -subpop- option to specify subpopulations, not
-if- or -in-.
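
As a rough sketch of the steps above: the variable names here (wgt for the supplied weight, q1 for the survey question, female for a 0/1 subpopulation indicator) are made up for illustration, and the syntax assumes the -svyset- and -svytab- commands as discussed in this thread, so check -help svytab- for your release.

```stata
* Hypothetical variable names: wgt (supplied sampling weight),
* q1 (survey question), female (0/1 subpopulation indicator).

* Declare the supplied weight as a pweight
svyset pweight wgt

* Constant to serve as the required second variable
gen dum = 1

* Weighted cell proportions with confidence intervals
svytab q1 dum, ci

* Restrict to a subpopulation with -subpop-, not -if- or -in-
svytab q1 dum, subpop(female) ci
```

With -subpop-, the observations outside the subpopulation still contribute to the variance calculation, which is why it is preferred over dropping them with -if-.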
And finally, you absolutely have to use the weights provided if you want the
results to be representative of the population and not just the sample. I
don't know about the CPS specifically, but it is unlikely that you'll be
provided with the psu and strata information because in well-conducted
surveys the clusters sampled are usually quite small and it would be
possible for a determined analyst to identify individuals - especially in
lightly-populated areas. To get around this problem in Canada, at least, for
large government surveys we are provided with a data set of bootstrap
weights from which to calculate bootstrapped standard errors. If you do not
have access to the psu and strata information, then you might enquire if
such a beast is available for the Current Population Survey.
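
If bootstrap weights are supplied, one way to use them by hand is sketched below. This is only an illustration under stated assumptions: wgt, q1, and the 500 replicate weights bsw1-bsw500 are hypothetical names, and the variance formula (mean squared deviation of the replicate estimates around the full-sample estimate) is one common convention. The data provider's documentation may call for a different scaling factor.

```stata
* Hypothetical names: wgt (full-sample weight), bsw1-bsw500 (bootstrap
* weights), q1 (survey question). Adjust the count of 500 to your file.

* Indicator for the response of interest; missing where q1 is missing
gen byte yes = (q1 == 1) if q1 < .

* Full-sample weighted proportion
quietly summarize yes [aweight = wgt]
scalar p_hat = r(mean)

* Re-estimate the proportion under each set of bootstrap weights
scalar ss = 0
forvalues b = 1/500 {
    quietly summarize yes [aweight = bsw`b']
    scalar ss = ss + (r(mean) - p_hat)^2
}

* Assumed convention: variance = average squared deviation over replicates
display "estimate = " p_hat "   bootstrap SE = " sqrt(ss/500)
```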
Lee Sieswerda, Epidemiologist
Thunder Bay District Health Unit
999 Balmoral Street
Thunder Bay, Ontario
Canada P7B 6E7
Tel: +1 (807) 625-5957
Fax: +1 (807) 623-2369
> -----Original Message-----
> From: David Kantor [SMTP:firstname.lastname@example.org]
> Sent: Wednesday, July 03, 2002 1:53 PM
> To: email@example.com
> Subject: Re: st: Using weights from the Current Population survey
> At 12:18 PM 7/3/2002 -0500, Nammi Kandula wrote:
> >I am doing an analysis with the March Current Population Survey.
> >I am doing a person-level analysis.
> >If I use the wgt variable, Stata asks me what kind of weight this is.
> >Is it an analytic weight, pweight, or fweight?
> >Should I use the weight in my regressions, or in my tab commands? Do I
> >need to transform the weight in any way?
> My experience is that all weights in surveys from the U.S. Census Bureau
> are pweights.
> You should specify them as pweights in regressions. (If you use them as
> aweight, the coefficients will be the same, but the variances and
> confidence intervals will be wrong. See the section in the user guide on
> Estimation, Weighted Estimation; that's U 26.12 in my ancient V5 manual.)
> Better -- use svyreg and specify the strata and psu, if these are
> identified. Still, specify the weight as pweight.
> For -tabulate-, pweight is not accepted. Use aweight or iweight; the
> proportions will not be affected by the choice, but iweight has the
> advantage that the "Freq." column will show the weighted sums of the
> observations, i.e., the estimated population count in each category. Be
> sure you have scaled the weight correctly at the outset, if there are any
> implied decimals in the raw data.
> I hope this helps.
> -- David K.
> David Kantor
> Institute for Policy Studies
> Johns Hopkins University
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/