Re: st: using svyset with pooled cross-sections from IPUMS-CPS


From   Austin Nichols <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: using svyset with pooled cross-sections from IPUMS-CPS
Date   Tue, 26 Apr 2011 16:53:25 -0400

Patrick Lapid <patrick.lapid@gmail.com> :
Whether to weight is a weightier issue than you may suspect. On how to
cluster, start with:
http://www.stata.com/statalist/archive/2010-04/msg00852.html
http://www.stata.com/statalist/archive/2008-04/msg00444.html

On Tue, Apr 26, 2011 at 4:02 PM, Patrick Lapid <patrick.lapid@gmail.com> wrote:
> Thank you, Stas. I'm not sure whether declaring the survey design is necessary,
> since I'm not necessarily generalizing the results to the total U.S. population.
> Should I just use regress with the vce(cluster) option? Would it be necessary to
> use pweight (sampling weights)?
>
> Best, Patrick
>
> ----Stas Kolenikov's <skolenik@gmail.com> reply----
>
> I don't think this is right. CPS is a rotating design, with the same
> households appearing (say) in February, March, April and May in both
> (say) 2010 and 2011. With your -svyset-, they would be treated as if
> they belonged to separate strata, which is not right (and
> counterproductive, actually: this design is optimized to have small
> standard errors on the measures of change, with 3/4 overlap between
> consecutive months, and 1/2 overlap between consecutive years, which
> helps bring down the variances by probably 20% and 10%, respectively,
> off the top of my survey statistician's intuition).
>
> ---end reply---
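
A minimal sketch of the alternative that Stas's reply points toward, assuming a household identifier that stays the same across survey years. The variable hhid, the outcome and regressor names (wage, educ), and all toy values are hypothetical and not from the thread; only year and perwt are. Using the household as the cluster across all years, with no strata(year), lets repeat interviews of one household share a cluster rather than being treated as independent draws from separate strata. It ignores any higher-level clustering in the actual CPS design, and it does not settle whether weighting is appropriate.

```stata
* Toy pooled data. hhid is a HYPOTHETICAL household ID that is constant
* across years; wage and educ are illustrative names only.
clear
input hhid year perwt wage educ
1 2009 1500 12.0 12
1 2010 1480 12.5 12
2 2009 2100 18.0 16
2 2010 2050 19.0 16
3 2010 1750  9.5 10
4 2009 1900 15.0 14
4 2010 1920 15.5 14
5 2009 1600 11.0 12
end

* PSU = household across all years, so a household interviewed in two
* years forms one cluster; year is not declared as a stratum.
svyset hhid [pweight=perwt]
svydescribe
svy: regress wage educ i.year

* The non-svy form asked about above: same weighted point estimates,
* with standard errors clustered on the household.
regress wage educ i.year [pweight=perwt], vce(cluster hhid)
```

Comparing the two sets of standard errors on real data is one way to see how much the declaration matters for a given model.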
>
> On Tue, Apr 26, 2011 at 11:56 AM, Patrick Lapid <patrick.lapid@gmail.com> wrote:
>> I'm currently working on a labor economics project using the U.S. Current
>> Population Survey from 2006 to 2010, with the data downloaded from IPUMS.
>> I'm not sure whether I've declared the survey design correctly. I am attempting to
>> analyze the data as pooled cross-sections, using survey estimation. I used
>> the following Statalist post as a guide:
>>
>> http://www.stata.com/statalist/archive/2008-10/msg00521.html
>>
>> I have the following lines of code to declare the survey design:
>>
>> . egen hhXyear = group(serial year)
>> . svyset hhXyear [pweight=perwt], strata(year)
>>
>> Since the original clusters (PSUs) were households, indexed by serial number,
>> I used egen to create new clusters of households in a given year. I then used
>> svyset to set the following:
>>
>> -clusters (PSUs): hhXyear
>> -strata: year (each separate year of the CPS)
>> -sampling weight: perwt (person weight, provided by IPUMS)
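
The declaration described above can be tried on a toy person-level file and inspected with svydescribe before it is used for estimation. This is only a sketch: the values and the pernum column are made up for illustration, while serial, year, perwt and hhXyear follow the names in the post. The same serial number deliberately appears in both years to show that group(serial year) produces a separate cluster for each household-year.

```stata
* Toy person-level data; all values are illustrative assumptions.
clear
input serial year pernum perwt
1 2009 1 1500
1 2009 2 1500
2 2009 1 2100
1 2010 1 1480
2 2010 1 2050
2 2010 2 2050
end

* One cluster per household-year, as in the post
egen hhXyear = group(serial year)

* Declare the design as in the post, then list strata and PSU counts
svyset hhXyear [pweight=perwt], strata(year)
svydescribe
```

Under this declaration every household-year is its own PSU within its year's stratum, which is the treatment the reply quoted earlier in the thread objects to.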

