Re: st: creating sample weights
"Janelle Knox" <email@example.com>
Sat, 11 Aug 2007 17:28:08 +0100
Thanks a lot Steve! That is very helpful. I am fairly new to Stata,
and the dataset didn't have a weight built in. So I really appreciate
On 8/10/07, Steven Joel Hirsch Samuels <firstname.lastname@example.org> wrote:
> On Aug 10, 2007, at 10:53 AM, Janelle Knox wrote:
> > I am trying to create a sample weight for a dataset, which will
> > correct for variations in gender, age, etc from population means.
> > Does anyone know how to do this, or where I can find information for
> > setting up a sample weight.... pweight=?
> > Thanks,
> > Jane
> Jane, you cannot do it to match population means. However, you can do
> it to match population percentages in different categories. The
> technique for this is known as "raking". In Stata this is available
> in Nick Winter's program -survwgt rake-. Type "ssc install
> survwgt". By the way, "pweight" is a Stata reserved word.
> A good reference for practice is: http://www.abtassociates.com/
> Warning: If you are not experienced with weighting, you can run into
> many problems. Raking will not fix, and might even worsen, certain
> kinds of sample deficiencies. If you have followed recent
> discussions on Statalist, you will be aware that not everyone
> recommends weighting before doing regressions.
> You don't say if there is an existing "design weight". If so, I
> will assume that its name is "old_wt". Otherwise, define "old_wt=1"
> before running the survwgt program.
> 1. Create grouped versions of the variables you wish to match in your
> original data set.
> 2. Now create a separate data set for each characteristic that you
> wish to match; each will contain the adjusted totals for that
> characteristic. Suppose your sample size is n=1,252. I will
> arbitrarily add or subtract 1 from the category numbered 1 for each
> characteristic to make sure your adjusted sample totals add up to the
> actual total. Below is an example for creating a data set
> "age_dat.dta" which contains the age group totals.
> 3. Merge these into your original data.
> 4. The rake instructions are then (for example):
> survwgt rake old_wt, by(race gender age_gp) totvars(race_tot
> gender_tot age_tot) generate(new_wt)
> 5. "new_wt" is your new weight variable. It will probably contain
> fractional values, but these cause no problems for the regressions.
> /*CREATE AGE DATA SET WITH ADJUSTED TOTALS SO THAT SAMPLE & POP
> PERCENTS MATCH */
> clear
> local ssize=1252
> /* Age Data Set: 1 10% 2 20% 3 50% 4 20% */
> input age_gp pop_pct
> 1 .1
> 2 .2
> 3 .5
> 4 .2
> end
> gen age_tot=`ssize'*pop_pct
> table age_gp , c(sum age_tot) row
> /* sort on the merge key so the file can be merged by age_gp */
> sort age_gp
> save age_dat, replace
> /*****************CODE ENDS ***************************/
> Steven Joel Hirsch Samuels
> 18 Cantine's Island
> Saugerties, NY 12477
> Phone: 845-246-0774
> EFax: 208-498-7441
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
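
For anyone following along, steps 1 and 3 in the message above might look roughly like the sketch below. This is only an illustration under stated assumptions: the variable "age", the cutpoints, and the simulated data are all made up, and the merge syntax is the newer "merge m:1" form (Stata 11 and later), not what was current when this thread was written. The totals file is rebuilt first so the block runs on its own.

```stata
* Rebuild the age totals file from the message above
clear
input age_gp pop_pct
1 .1
2 .2
3 .5
4 .2
end
gen age_tot = 1252*pop_pct
save age_dat, replace

* Hypothetical sample: 1,252 respondents with a made-up age variable
clear
set seed 12345
set obs 1252
gen age = 18 + floor(60*runiform())

* Step 1: group age into 4 categories coded 1-4
* (the cutpoints 18/30/45/65 are assumptions for illustration)
egen age_gp = cut(age), at(18 30 45 65 120) icodes
replace age_gp = age_gp + 1

* No design weight in this example, so start from a weight of 1
gen old_wt = 1

* Step 3: attach the age totals to every respondent
merge m:1 age_gp using age_dat, keepusing(age_tot)
assert _merge == 3
drop _merge

* Step 4 (the survwgt rake call quoted above) would follow once the
* race and gender totals have been merged in the same way.
```

The "assert _merge == 3" line is there to stop the do-file if any respondent falls in an age group that has no total, which would otherwise leave missing values in age_tot.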