
From: "Austin Nichols" <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: MORG data aggregation
Date: Thu, 10 Apr 2008 14:47:21 -0400

Jimmy Verner <jverner@earthlink.net>:

That's the US Census Bureau you mean, I presume, and the survey is the Current Population Survey (CPS). You don't mention what years and months you are using--the file format changes over time. Individuals are weighted not "so that the data is nationally aggregated" but so that the data can be made (more or less) representative of the resident noninstitutionalized population.

Jean Roth has a very nice collection of materials to begin with at
http://www.nber.org/data/cps_index.html
and you should also try:

     ssc install ddf2dct
     help ddf2dct

The CPS has a somewhat odd hierarchical structure, with three different kinds of records stacked on top of each other. It's possible you have neglected to put the data in the household and family records into new variables, and drop those extraneous records. If so, you might see:

     +----------------------------+
     | h_seq   precord   a_fnlwgt |
     |----------------------------|
     |     1         1       3200 |
     |     1         2          1 |
     |     1         3        628 |
     |     1         3      539.4 |
     |     1         3     506.81 |
     |     1         3     611.33 |
     |     1         3     491.08 |
     |----------------------------|
     |     2         1       3200 |
     |     2         2          2 |
     |     2         3     464.15 |
     |----------------------------|
     |     3         1       3200 |
     |----------------------------|
     |     4         1       3200 |
     |     4         2          2 |
     |     4         3      518.7 |
     |     4         3     534.48 |
     +----------------------------+

where the only real person records are those with precord==3, and adding up the false weights for observations with precord==1 or precord==2 would result in too-large estimated population sizes. If you look in e.g.
http://www.nber.org/data/progs/cps/cpsmar07.do
you will see a bunch of -replace- statements followed by a -keep if precord==3- (this is one way to turn the hierarchical file with 3 kinds of records into a person-level file).

But the calculations below indicate that may not be the problem, and you may have other problems... perhaps you have a file with an implicit decimal point in the weight variable, and you have forgotten to divide by some power of ten (usually "two implied decimal places" so you must divide the weight by 100)?

     clear all
     qui infile using cpsmar07, using(cpsmar07.dat)
     replace gestfips=gestfips[_n-1] if precord>1
     su a_fnlw if gestf==1, meanonly
     di %14.0f r(sum)
          479069661
     su a_fnlw if gestf==1 & precord==3, meanonly
     di %14.0f r(sum)
          4555061
     su a_fnlw if gestf==1 & inlist(pemlr,1,2,3,4), meanonly
     di %14.0f r(sum)
          2180305
     su a_fnlw if gestf==1 & precord==3 & inlist(pemlr,1,2,3,4), meanonly
     di %14.0f r(sum)
          2180305

This last is the estimate of Alabama's labor force in March 07, about 2.2 million, and that estimate is not affected by having the HH and family records on the file. In general, the total labor force is about half the total population, and the latter numbers are available in published tables for you to check or at e.g.
http://quickfacts.census.gov/qfd/

The CPS survey design variables are not on the public-use files, only weights, but you can get reasonable estimates with:

     egen psu=group(gestcen gtcsa)
     svyset [pw=mars], strat(gestcen) psu(psu)

and see e.g.
http://www.amstat.org/Sections/Srms/Proceedings/papers/1992_127.pdf
for more detail.

On Thu, Apr 10, 2008 at 12:28 PM, Jimmy Verner <jverner@earthlink.net> wrote:
> The Census publishes monthly MORG files. Individuals are weighted so that
> the data is nationally aggregated. I'm trying to pull monthly observations
> by state from the files, but I'm not doing something right. I've tried the
> various svy commands but my results just don't make sense (e.g., Alabama
> does not have a labor force of 22 million!).
>
> Does anyone have any do files on this subject? Any other input would be
> much appreciated.
>
> I'm running Intercooled Stata 8.0 on OS X 5.
>
> TIA.
>
> Jimmy Verner
> Graduate Student
> School of Economic, Political & Policy Sciences
> University of Texas - Dallas
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
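
The record-handling step described in the reply (copy household-level values down, then keep only person records before summing weights) can be sketched in Stata roughly as follows. This is only an illustration on made-up toy data: the variable hh_state is a hypothetical stand-in for a household-level field such as gestfips, and the weights are copied from the first rows of the listing in the reply, not from a real extract.

```stata
* Toy data mimicking the stacked household/family/person layout
* (hh_state is a hypothetical household-level field, filled only on precord==1)
clear
input h_seq precord a_fnlwgt hh_state
1 1 3200   1
1 2 1      .
1 3 628    .
1 3 539.4  .
2 1 3200   1
2 2 2      .
2 3 464.15 .
end

* Carry the household-level value down onto the family and person records
replace hh_state = hh_state[_n-1] if precord > 1

* Drop household and family records so only real person weights remain
keep if precord == 3

* Sum person weights for the state of interest
summarize a_fnlwgt if hh_state == 1, meanonly
display %14.2f r(sum)
```

If the raw weight were read with implied decimal places, it would additionally need to be divided by the appropriate power of ten before summing, as the reply suggests.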

