

From: Steven Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Weighted counts with "svy" command
Date: Fri, 16 Sep 2011 09:05:31 -0400

Shige-

My guess is that you are accustomed to surveys in which the sampling weights have been normalized to sum to the sample size. Such weights are still issued with survey data sets such as the Demographic and Health Surveys (where the given weights must first be divided by 1,000,000). For estimating means, including proportions, and regression coefficients, the normalization does not matter. However, the weighted category counts in such data sets are meaningless.

In NHIS, the original sampling weights do not sum exactly to the US category totals. Post-sample adjustments are applied so that weighted sample totals match US population totals for age, race/ethnicity, and sex. See: http://www.ihis.us/ihis/userNotes_weights.shtml.

Steve

On Sep 15, 2011, at 10:26 AM, Austin Nichols wrote:

Shige Song <shigesong@gmail.com>:
Weighted counts *should* sum to the population size. Perhaps you want to treat your pweights as aweights, in which case you would get the sum as the number of obs? But you claim that is not what you want, since the obs option does not give the desired total. Your desideratum is very mysterious. Maybe you want the product of weighted proportions and the number of obs as returned by tabulate treating pweights as aweights? Why would you want that?

webuse nhanes2
svy:tab region, count se
mata: sum(st_matrix("e(b)"))
svy:tab region, obs
mata: sum(st_matrix("e(Obs)"))
tab region [aw=finalwgt]
*same as:
svy:tab region
mata: st_matrix("e(b)"):*sum(st_matrix("e(Obs)"))

On Thu, Sep 15, 2011 at 10:10 AM, Shige Song <shigesong@gmail.com> wrote:
> Hi Steve,
>
> The weighted counts that we are getting with svy syntax are in the
> millions (222,760,817)--these are for the whole U.S. population. We
> want weighted counts for our sample (approximately 300,000 cases).
>
> Thanks.
>
> Shige
>
> On Wed, Sep 14, 2011 at 5:04 PM, Steven Samuels <sjsamuels@gmail.com> wrote:
>>
>> What would weighted counts look like that are not the population counts?
>> I can't think of any, so please supply an example.
>>
>>
>> Steve
>>
>> On Sep 14, 2011, at 10:19 AM, Shige Song wrote:
>>
>> Dear Colleagues,
>>
>> We are trying to do a descriptive table of basic socio-demographic
>> and health characteristics of our 3 subpopulations of interest
>> (African born, Latin American born, and US born) using the National
>> Health Interview Survey (NHIS). (We're using a pooled file,
>> 2005-2009.) In previous research we would simply use tabulate and
>> show both the freq and % in our descriptive table. Now we're using
>> the "svyset" command and then using "svy: tabulate nativity, count" to
>> get the weighted counts in the dataset. However, this command gives
>> the weighted counts in, apparently, the total population, not in the
>> dataset. Do you know how to obtain the weighted counts in the dataset
>> using "svy"? I also tried "svy: tabulate nativity, obs", but that
>> gives us the unweighted number of observations. Please see the output
>> below:
>>
>> Below, for reference, are the unweighted tabulations of our nativity
>> groups in our 5-year pooled file.
>> . tab nativity, m
>>
>>            Nativity |      Freq.     Percent        Cum.
>> --------------------+-----------------------------------
>>           U.S. born |    231,546       77.02       77.02
>> Latin American born |     43,246       14.39       91.41
>>        African Born |      1,857        0.62       92.02
>>               Other |     23,982        7.98      100.00
>> --------------------+-----------------------------------
>>               Total |    300,631      100.00
>>
>>
>> And here are the weighted counts when we use the "svy" syntax, but
>> they are apparently counts in the total population. We are looking
>> for weighted frequencies in the dataset.
>> . svy: tabulate nativity, count format(%14.3gc)
>> (running tabulate on estimation sample)
>>
>> Number of strata = 639          Number of obs   = 300631
>> Number of PSUs   = 1278         Population size = 222760817
>>                                 Design df       = 639
>> -----------------------
>> Nativity  |       count
>> ----------+------------
>>  U,S, bor | 185,258,131
>>  Latin Am |  20,152,746
>>   African |   1,246,467
>>     Other |  16,103,473
>>           |
>>     Total | 222,760,817
>> -----------------------
>>   Key: count = weighted counts
>>
>> And if we just use "svy: tabulate nativity" (with no option
>> specified), we get only the cell proportions, although they are
>> properly weighted.
>> . svy: tabulate nativity
>> (running tabulate on estimation sample)
>>
>> Number of strata = 639          Number of obs   = 300631
>> Number of PSUs   = 1278         Population size = 222760817
>>                                 Design df       = 639
>>
>> -----------------------
>> Nativity  | proportions
>> ----------+------------
>>  U,S, bor |       .8316
>>  Latin Am |       .0905
>>   African |       .0056
>>     Other |       .0723
>>           |
>>     Total |           1
>> -----------------------
>>   Key: proportions = cell proportions
>>
>>
>> We tried using "svy: tabulate nativity, obs percent", see below, and
>> this gives us the weighted percents but the unweighted number of
>> observations in each category. We have looked at Stata help for svy:
>> tabulate, but can't seem to figure this out. We suspect it's simple.
>> Does anyone know how to get the weighted counts in the dataset with
>> svy: tabulate?
>> . svy: tabulate nativity, obs percent format(%14.3gc)
>> (running tabulate on estimation sample)
>>
>> Number of strata = 639          Number of obs   = 300631
>> Number of PSUs   = 1278         Population size = 222760817
>>                                 Design df       = 639
>>
>> ------------------------------------
>> Nativity  | percentages         obs
>> ----------+-------------------------
>>  U,S, bor |        83.2     231,546
>>  Latin Am |        9.05      43,246
>>   African |         .56       1,857
>>     Other |        7.23      23,982
>>           |
>>     Total |         100     300,631
>> ------------------------------------
>>   Key: percentages = cell percentages
>>        obs         = number of observations
>>
>> Thanks so much for taking the time to look at this.
>>
>> Best,
>> Shige

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
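The arithmetic behind the thread can be sketched in a few lines. This is only an illustration, and it assumes that "weighted counts in the dataset" means the population-scale counts rescaled so that they sum to the number of observations, which is what Austin's product of weighted proportions and the number of obs amounts to. The figures are copied from the svy: tabulate output quoted above; the function names `rescale_counts` and `proportions` are hypothetical helpers, not part of Stata or any library. As Steve notes, counts rescaled this way carry no population meaning, while the proportions are unchanged.

```python
# Figures copied from the "svy: tabulate nativity, count" output in the thread.
population_counts = {
    "U.S. born": 185_258_131,
    "Latin American born": 20_152_746,
    "African born": 1_246_467,
    "Other": 16_103_473,
}
n_obs = 300_631          # "Number of obs" in the output
pop_size = 222_760_817   # "Population size" in the output


def rescale_counts(counts, n_obs, pop_size):
    """Scale population-level weighted counts so they sum to the sample size.

    Hypothetical helper: equivalent to multiplying every sampling weight
    by n_obs / pop_size before tabulating.
    """
    factor = n_obs / pop_size
    return {label: value * factor for label, value in counts.items()}


def proportions(counts):
    """Return each cell's share of the total."""
    total = sum(counts.values())
    return {label: value / total for label, value in counts.items()}


scaled_counts = rescale_counts(population_counts, n_obs, pop_size)

# Print the sample-scale counts next to the (unchanged) weighted proportions.
for label, value in scaled_counts.items():
    share = proportions(scaled_counts)[label]
    print(f"{label:<20} {value:>12,.0f} {share:>8.4f}")
print(f"{'Total':<20} {sum(scaled_counts.values()):>12,.0f}")
```

Rescaling multiplies every cell by the same constant, so the proportions match those reported by svy: tabulate; only the totals change scale.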

**References**:
- **st: Weighted counts with "svy" command** *From:* Shige Song <shigesong@gmail.com>
- **Re: st: Weighted counts with "svy" command** *From:* Steven Samuels <sjsamuels@gmail.com>
- **Re: st: Weighted counts with "svy" command** *From:* Shige Song <shigesong@gmail.com>
- **Re: st: Weighted counts with "svy" command** *From:* Austin Nichols <austinnichols@gmail.com>
