
From: Steven Joel Hirsch Samuels <sjhsamuels@earthlink.net>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: creating sample weights
Date: Fri, 10 Aug 2007 14:31:43 -0400

On Aug 10, 2007, at 10:53 AM, Janelle Knox wrote:

> I am trying to create a sample weight for a dataset, which will correct for variations in gender, age, etc from population means. Does anyone know how to do this, or where I can find information for setting up a sample weight.... pweight=? Thanks, Jane

Jane, you cannot do it to match population means. However, you can do it to match population percentages in different categories. The technique for this is known as "raking". In Stata it is available in Nick Winter's program -survwgt rake-. Type "ssc install survwgt". By the way, "pweight" is a Stata reserved word.

A good reference for practice is: http://www.abtassociates.com/presentations/raking_survey_data_2_JOS.pdf

Warning: If you are not experienced with weighting, you can run into many problems. Raking will not fix, and might even worsen, certain kinds of sample deficiencies. If you have followed recent discussions on Statalist, you will be aware that not everyone recommends weighting before doing regressions.

You don't say if there is an existing "design weight". If there is, I assume that its name is "old_wt". Otherwise, define "old_wt=1" before running the survwgt program.
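As a minimal sketch of that setup step, assuming the survey data are already in memory and that a missing "old_wt" means there is no design weight:

```stata
* Sketch only: create a constant starting weight when no design weight exists.
* Assumption: the design weight, if there is one, is already named old_wt.
capture confirm variable old_wt
if _rc {
    generate old_wt = 1
}
```

With a constant starting weight, raking adjusts every case from the same baseline, so the final weights reflect only the category totals you supply.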

1. Create grouped versions of the variables you wish to match in your original data set.

2. Now create a separate data set for each characteristic that you wish to match. Each one will contain the adjusted totals for that characteristic. Suppose your sample size is n=1,252. I will arbitrarily add or subtract 1 from the category numbered 1 for each characteristic to make sure your adjusted sample totals add up to the actual total. Below is an example for creating a data set "age_dat.dta" which contains the age group totals.

3. Merge these into your original data.

4. The rake instructions are then (for example):

survwgt rake old_wt, by(race gender age_gp) totvars(race_tot gender_tot age_tot) generate(new_wt)

5. "new_wt" is your new weight variable. It will probably contain fractions, but these will not affect the regressions.
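One way to check the result of step 5 is sketched below. It assumes the rake command in step 4 has run, and it uses hypothetical analysis variables "y", "x1" and "x2" that are not part of the original question.

```stata
* Sketch only: y, x1 and x2 are hypothetical analysis variables.

* Weighted percentages should now match the population percentages
tabulate age_gp [aweight = new_wt]
tabulate gender [aweight = new_wt]

* The new weights should sum to the totals supplied in totvars()
tabstat new_wt, by(age_gp) statistics(sum)

* Use the raked weight as a probability weight in a regression
regress y x1 x2 [pweight = new_wt]
```

If the weighted percentages are still far from the targets, the raking probably did not converge and the inputs are worth rechecking.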

-Steve

/* CREATE AGE DATA SET WITH ADJUSTED TOTALS SO THAT SAMPLE & POP PERCENTS MATCH */
local ssize=1252
clear

/* Age Data Set, population percents by group: 1 10% 2 20% 3 50% 4 20% */
input age_gp pop_pct
1 .1
2 .2
3 .5
4 .2
end

/* Adjusted total for each age group = sample size * population proportion */
gen age_tot=`ssize'*pop_pct
list
table age_gp , c(sum age_tot) row

/* Sort on the merge key, age_gp, before saving */
sort age_gp
save age_dat, replace
/*****************CODE ENDS ***************************/
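Steps 1 and 3 might look like the sketch below for the age characteristic. The file name "mysurvey", the variable "age" and the cutpoints 30, 45 and 65 are assumptions made up for illustration. The -merge m:1- syntax needs Stata 11 or later; in older versions, sort both files on age_gp and type "merge age_gp using age_dat".

```stata
* Sketch only: mysurvey, age and the cutpoints are hypothetical.
use mysurvey, clear

* Step 1: grouped version of age, coded 1-4 to match age_gp in age_dat
generate age_gp = 1 + (age >= 30) + (age >= 45) + (age >= 65) if !missing(age)

* Step 3: attach the adjusted total for each respondent's age group
merge m:1 age_gp using age_dat, keepusing(age_tot)

* Every respondent should find a matching age group
assert _merge == 3
drop _merge
```

The same pattern would be repeated for each of the other characteristics, so that race_tot and gender_tot are also in the data before -survwgt rake- is run.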

Steven Joel Hirsch Samuels

sjhsamuels@earthlink.net

18 Cantine's Island

Saugerties, NY 12477

Phone: 845-246-0774

EFax: 208-498-7441
