Thanks a lot, Steve. That is very helpful.

Junqing

> No, it just means, if anything, that you should not make much of small
> differences. In the admittedly artificial example in L&L's book, the
> estimate of the subpopulation mean was $11.52, compared to the true
> value of $11.60. I can think of examples where a difference in the
> distribution of weights would be expected and would not lead to bias.
>
> No sampling text advises against using the sample weights, and I
> would use them. Note that if a subpopulation size is <20, then the
> standard errors that Stata reports will be untrustworthy.
>
> Steve
>
> Steven Samuels
> sjsamuels@gmail.com
> 18 Cantine's Island
> Saugerties NY 12477
> USA
> Voice: 845-246-0774
> Fax: 206-202-4783
>
> On Mon, May 24, 2010 at 3:37 PM, <jl591164@albany.edu> wrote:
>> Thanks, Steve. The t-test and ranksum test indicate that the means of
>> the weights in the subpopulation and its complement are significantly
>> different. Does this mean that it is better not to apply the original
>> sample weights to the subsample descriptive analysis? Thanks a lot.
>>
>> Junqing
>>
>>> The subpopulation observations receive the original sample weights.
>>> These might not be appropriate for the subpopulation and can lead to
>>> bias (see an example in Levy and Lemeshow, Sampling of Populations,
>>> Wiley, 2008, pp. 147-148). There's not much that you can do about that
>>> without external information about the subpopulation. I speculate
>>> (but could be wrong!) that the bias arises when the probability of
>>> being a subpopulation member is correlated with the original weights.
>>> If so, you can check for this bias by plotting the subpopulation
>>> indicator against the weights with -ksm-. Or, more simply, just
>>> check whether the distributions of the weights in the subpopulation
>>> and its complement are different.
>>>
>>> Steve
>>>
>>> On Fri, May 21, 2010 at 1:40 PM, <jl591164@albany.edu> wrote:
>>>> Thanks. That is actually what I did, using survey set first, then
>>>> svy, subpop(). The subpop() option will use all cases in the
>>>> calculation of standard errors, but only the subsample in the
>>>> calculation of the point estimates. So, the total sampling weights
>>>> will be used in the calculation of standard errors of the subsample.
>>>> I have a follow-up question: how are the total weights applied to
>>>> the point estimates of the subsample by subpop()?
>>>>
>>>>> On Fri, May 21, 2010 at 11:32 AM, <jl591164@albany.edu> wrote:
>>>>>> My data provides a sampling weight for each id. But my study is
>>>>>> based on a subsample of the data because I selected cases by two
>>>>>> variables: age and type of placement. Can I still apply the whole
>>>>>> sample weights to my subsample descriptive analysis? Thanks a lot.
>>>>>
>>>>> You should use the complete sample with the -svy, subpop()- option.
>>>>> See http://www.stata-journal.com/article.html?article=st0153.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
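
On the follow-up question in the thread (how the original weights enter the subpopulation point estimate), a small numeric illustration may help. This is only a sketch, assuming the usual weighted ratio form sum(w*d*y) / sum(w*d), where d is a 0/1 subpopulation indicator. It is written in Python rather than Stata, the function name and the data are hypothetical and not taken from the thread, and it is not Stata's implementation.

```python
# Illustrative sketch only: hypothetical data, hypothetical function name.
# Point estimate of a subpopulation mean using the ORIGINAL sample weights.

def subpop_weighted_mean(values, weights, in_subpop):
    """Weighted ratio estimate: sum(w*d*y) / sum(w*d), d = 0/1 indicator."""
    num = sum(w * d * y for y, w, d in zip(values, weights, in_subpop))
    den = sum(w * d for w, d in zip(weights, in_subpop))
    if den == 0:
        raise ValueError("no weighted observations in the subpopulation")
    return num / den

# Full sample is kept; the indicator marks subpopulation members.
y = [10, 12, 20, 16, 8, 16]          # outcome (made-up values)
w = [100, 300, 50, 100, 200, 250]    # original sampling weights (made-up)
d = [1, 1, 0, 1, 0, 0]               # 1 = in subpopulation

print(subpop_weighted_mean(y, w, d))  # 12.4
```

Observations outside the subpopulation contribute zero to both sums, so the point estimate matches a weighted mean over the subsample alone. The sketch says nothing about standard errors, which is where keeping the complete sample matters.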

