st: can the total sampling weights be applied to subsample analysis

Mon, 24 May 2010

No, it just means, if anything, that you should not make much of small
differences. In the admittedly artificial example in L&L's book, the
estimate of the subpopulation mean was $11.52, compared to the true
value of $11.60. I can think of examples where a difference in the
distribution of weights would be expected and would not lead to bias.
No sampling text advises against using the sample weights, and I would
use them.

Note that if a subpopulation size is <20, then the standard errors that
Stata reports will be untrustworthy.

Steve

Steven Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 845-246-0774
Fax: 206-202-4783

On Mon, May 24, 2010 at 3:37 PM, <jl591164@albany.edu> wrote:
> Thanks, Steve. A t-test and a ranksum test indicate that the means of
> the weights in the subpopulation and its complement are significantly
> different. Does this mean that it is better not to apply the original
> sample weights to the subsample descriptive analysis? Thanks a lot.
>
> Junqing
>
>> The subpopulation observations receive the original sample weights.
>> These might not be appropriate for the subpopulation and can lead to
>> bias (see an example in Levy and Lemeshow, Sampling of Populations,
>> Wiley, 2008, pp. 147-148). There's not much that you can do about
>> that without external information about the subpopulation. I
>> speculate (but could be wrong!) that the bias arises when the
>> probability of being a subpopulation member is correlated with the
>> original weights. If so, you can check for this bias by plotting the
>> subpopulation indicator against the weights with -ksm-. Or, more
>> simply, just check whether the distributions of the weights in the
>> subpopulation and its complement are different.
>>
>> Steve
>>
>> On Fri, May 21, 2010 at 1:40 PM, <jl591164@albany.edu> wrote:
>>> Thanks. That is actually what I did, using survey set first, then
>>> svy, subpop(). The subpop() option will use all cases in the
>>> calculation of standard errors, but only the subsample in the
>>> calculation of the point estimates. So, the total sampling weights
>>> will be used in the calculation of standard errors of the subsample.
>>> I have a follow-up question. How are the total weights applied to
>>> point estimates of the subsample by subpop()?
>>>
>>>> On Fri, May 21, 2010 at 11:32 AM, <jl591164@albany.edu> wrote:
>>>>> My data provides a sampling weight for each id. But my study is
>>>>> based on a subsample of the data because I selected cases by two
>>>>> variables: age and type of placement. Can I still apply the whole
>>>>> sample weights to my subsample descriptive analysis? Thanks a lot.
>>>>
>>>> You should use the complete sample with the -svy, subpop()- option.
>>>> See http://www.stata-journal.com/article.html?article=st0153.
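
On the follow-up question in the thread (how the full-sample weights enter the subpopulation point estimate), here is a minimal sketch in Python rather than Stata. It assumes the usual weighted ratio form of a subpopulation mean: each observation keeps its original weight, and a 0/1 membership indicator zeroes out non-members in both the numerator and the denominator. The function names and the data are made up for illustration. This is not Stata's internal code, and it says nothing about the standard errors, which need the full design. The second function is the simple check suggested above: compare the weights in the subpopulation with those in its complement.

```python
def subpop_mean(y, w, d):
    """Weighted mean of y over the subpopulation flagged by d (1 = member).

    Every observation keeps its original sampling weight w; the indicator d
    removes non-members from both the numerator and the denominator.
    """
    num = sum(wi * di * yi for yi, wi, di in zip(y, w, d))
    den = sum(wi * di for wi, di in zip(w, d))
    if den == 0:
        raise ValueError("no subpopulation members with positive weight")
    return num / den


def mean_weight_by_group(w, d):
    """Return (mean weight inside subpopulation, mean weight in complement)."""
    inside = [wi for wi, di in zip(w, d) if di == 1]
    outside = [wi for wi, di in zip(w, d) if di == 0]
    if not inside or not outside:
        raise ValueError("both groups must be non-empty")
    return sum(inside) / len(inside), sum(outside) / len(outside)


# Hypothetical data: outcome, original sampling weight, subpopulation flag.
y = [10.0, 12.0, 11.0, 14.0, 9.0]
w = [100.0, 200.0, 150.0, 50.0, 300.0]
d = [1, 1, 0, 1, 0]

print("subpopulation mean:", subpop_mean(y, w, d))
print("mean weight (in, out):", mean_weight_by_group(w, d))
```

A large gap between the two mean weights does not by itself show bias, as noted in the reply above; it only flags that subpopulation membership and the weights are related.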

