
From: Steven Samuels <ssamuels@albany.edu>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: weight

Date: Tue, 3 Jul 2007 17:16:57 -0400

On Jul 3, 2007, at 1:27 PM, Sebastian Kruk wrote:

I have a Stata database and an Excel spreadsheet. In the first one, I have survey microdata on households (e.g. income, quality of life, poverty, education, employment, housing and equipment). In the second one, I have an Excel file with population projections. I would like to use it to compute a weight for the data in Stata. How do I form groups by age and then use a weight by age group?

What you want to do is called "poststratification."

1. Create age group categories for your data to match categories in your spreadsheet.

Suppose the spreadsheet categories and corresponding population projection counts are:

20-29 140,000

30-39 120,000

40-49 300,000

50-59 200,000

60-69 100,000

70-100 50,000

In your Stata data set, create a grouped age variable with the same categories:

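egen agegp=cut(age), at(20,30,40,50,60,70,101)

// Creates a grouped variable & names each category by its LH endpoint.

// The last number in at() is an upper bound that is not itself included, so 101 is needed to keep ages 70-100; without it, everyone aged 70 or older would get a missing agegp.

label define agegp 20 "20-29" 30 "30-39" 40 "40-49" 50 "50-59" 60 "60-69" 70 "70-100"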

label values agegp agegp
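
Because -egen, cut()- sets the new variable to missing for any age outside the range given in at(), it is worth checking the result before going on. A minimal sketch of such a check, assuming the variables are named age and agegp as above:

```stata
* Any missing agegp means an age fell outside the at() range
tabulate agegp, missing

* Minimum and maximum age within each group, as a check on the cut points
tabstat age, by(agegp) statistics(min max n)
```

If the second table shows, say, a maximum of 29 in the "20-29" group and so on, the categories line up with the spreadsheet.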

2. Create a variable in Stata to hold the projection counts. You can do this by hand, or you can add an "agegp" column to the spreadsheet, save it as a comma- or tab-delimited text file, use -insheet- to bring it into a Stata data set, and then -merge- it with your data by agegp.

By hand:

gen age_ct=140000 if agegp==20

replace age_ct=120000 if agegp==30

replace age_ct=300000 if agegp==40

.

.

replace age_ct=50000 if agegp==70
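
The spreadsheet route mentioned in step 2 could look roughly like the sketch below. The file names (projections.csv, projections.dta, survey.dta) are hypothetical, and the text file is assumed to have a header row with the columns agegp (coded 20, 30, ..., 70) and age_ct (plain numbers without thousands separators). The -merge m:1- syntax needs Stata 11 or later; older versions use -sort agegp- followed by -merge agegp using-.

```stata
* Read the projection counts from a comma-delimited text file (hypothetical name)
insheet using "projections.csv", comma clear
save "projections.dta", replace

* Attach the counts to every survey record in the matching age group
use "survey.dta", clear
merge m:1 agegp using "projections.dta"

* Records with _merge==1 have no projection count; _merge==2 are unused groups
tabulate _merge
```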

3. Finally, -svyset- your data. As Friedrich stated, there may already be a "weight" variable. There may also be a stratum variable, and primary sampling unit variable in your microsurvey. These should be documented in the survey codebook.

I assume you have all three, named "final_wt", "stratum", and "psu".

The -svyset- command is:

svyset psu [pweight=final_wt], strata(stratum) poststrata(agegp) postweight(age_ct)

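To see what the poststratification adjustment does to the weights, the calculation can be sketched by hand: within each age group, every weight is multiplied by the projected count divided by the sum of the original weights in that group. The variable names sum_wt and ps_wt are made up for this illustration. This is only a way to inspect the adjustment; it is not a substitute for the poststrata() and postweight() options, which also let the -svy- commands account for poststratification in the standard errors.

```stata
* Sum of the original sampling weights within each age group
bysort agegp: egen double sum_wt = total(final_wt)

* Rescale so the weights in each group add up to the projected count
gen double ps_wt = final_wt * age_ct / sum_wt

* Check: the sum of ps_wt in each group should equal age_ct for that group
tabstat ps_wt, by(agegp) statistics(sum)
```
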
4. If you have separate population counts by age and gender, then you might create a combined age-gender variable and post-stratify on this 12-category variable. Be aware that if you have few people in some categories, you may need to combine them.

For example:

gen age_gender = 21 if agegp==20 & gender==1

replace age_gender = 22 if agegp==20 & gender==2

etc.
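
As an alternative to typing out all twelve statements, the same codes (21, 22, 31, 32, ...) can be produced in one line. This is a sketch that assumes gender is coded 1 and 2 and that agegp takes the values 20, 30, ..., 70 as created in step 1; the tabulation afterwards shows whether any category is too small to stand on its own.

```stata
* Alternative to the separate gen/replace statements: 20 + 1 = 21, 20 + 2 = 22, ...
gen age_gender = agegp + gender

* Look for sparse or missing categories before post-stratifying
tabulate age_gender, missing
```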

-Steven
