
From: Steven Joel Hirsch Samuels <sjhsamuels@earthlink.net>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Svy subsamples
Date: Sat, 24 Nov 2007 17:04:52 -0500

I believe that the questions of whether to include weights, and of which weights to include, must be decided on a case-by-case basis. I might not use supplied weights in analyses of subpopulations or for certain analytic studies. I do not know whether the reasons for my decisions not to weight would be considered prevalent concepts. In multilevel modeling (MLM), estimated variance components might be important, and I have not thought about weighting in connection with these. Certainly, a major reason that scientists do not consider whether to weight multilevel analyses is that their software cannot do so.

The reference below states that the following MLM packages can incorporate sampling weights: MPLUS, LISREL, MLWIN, and the contributed Stata program GLLAMM. Constructing the weights for these programs is not straightforward, because different factors apply at each level of a hierarchical model. The reference describes, and contains links to, SAS and Stata programs that construct composite weights for two-level analyses.

Software to Compute Sampling Weights for Multilevel Analysis, by Kim Chantala, Dan Blanchette, and C. M. Suchindran, 2006 ( http://www.cpc.unc.edu/restools/data_analysis/ml_sampling_weights/Compute%20Weights%20for%20Multilevel%20Analysis.pdf )
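As a rough illustration of why different factors apply at each level, the sketch below builds a cluster-level weight from the cluster's selection probability and a within-cluster weight from the conditional selection probability, then rescales the within-cluster weights so that they sum to the cluster's sample size. This is only a minimal sketch under stated assumptions, not the method of the programs in the reference: the field names (`cluster`, `p_cluster`, `p_within`), the function name, the data, and the choice of scaling are all hypothetical, and the reference should be consulted for the scalings its programs actually implement.

```python
from collections import defaultdict


def two_level_weights(records):
    """Return level-2 and level-1 weights for each sampled unit.

    Each record is a dict with hypothetical keys:
      'cluster'   - cluster identifier
      'p_cluster' - probability that the cluster was selected
      'p_within'  - probability that the unit was selected, given its cluster
    """
    # Collect the raw conditional level-1 weights by cluster.
    raw_by_cluster = defaultdict(list)
    for rec in records:
        raw_by_cluster[rec["cluster"]].append(1.0 / rec["p_within"])

    out = []
    for rec in records:
        raw = raw_by_cluster[rec["cluster"]]
        n_j = len(raw)  # number of sampled units in this cluster
        w1_raw = 1.0 / rec["p_within"]
        out.append({
            "cluster": rec["cluster"],
            # Level-2 weight: inverse of the cluster selection probability.
            "w2": 1.0 / rec["p_cluster"],
            # Level-1 weight: inverse of the conditional selection probability.
            "w1_raw": w1_raw,
            # Assumed scaling: rescale so the level-1 weights in a cluster
            # sum to that cluster's sample size.
            "w1_scaled": w1_raw * n_j / sum(raw),
        })
    return out


# Invented example: two clusters with unequal selection probabilities.
sample = [
    {"cluster": "A", "p_cluster": 0.5, "p_within": 0.25},
    {"cluster": "A", "p_cluster": 0.5, "p_within": 0.5},
    {"cluster": "B", "p_cluster": 0.1, "p_within": 0.2},
    {"cluster": "B", "p_cluster": 0.1, "p_within": 0.2},
    {"cluster": "B", "p_cluster": 0.1, "p_within": 0.2},
]

for row in two_level_weights(sample):
    print(row)
```

The point of keeping `w2` and `w1_scaled` separate is that a multilevel program needs one weight per level; a single overall weight (their product) does not carry the same information.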

Steven Samuels

To Steven Samuels:

First, let me apologize for "interfering" in your fascinating conversation. I wonder whether, in your opinion, the same concepts are prevalent when some scientists do not take sampling weights into account when analyzing cluster samples from household surveys with multilevel regression techniques, or whether it is simply that techniques for dealing with sampling weights are not available for multilevel regression?

TIA,
Moises Rosas

--- Steven Joel Hirsch Samuels <sjhsamuels@earthlink.net> wrote:

You are not interfering. This is a conversation open to all. This is a slightly expanded version of what I sent to you privately.

How to treat the subpopulation and weights depends on the purpose of the study. There is a Statalist thread which you can look up.

First, note that the 'subpopulation' Richard's student wants to study is not a 'subsample'. I have sometimes taken 10 random subsamples of a single population to study variability between samples. This is the method of 'interpenetrating replicated subsamples' of Mahalanobis, which was popularized by W. E. Deming in the 1950s (Sample Design in Business Research, Wiley, 1960).

To expand on the reason for ignoring the subpopulation criterion: if Richard's student were to analyze the data as a subpopulation, then every sample mean would have to be considered a ratio estimate, effectively analyzed with a 'ratio' procedure, which is what the 'subpop' option in the survey commands does. This is because the denominator in

mean = (sum of X variable)/(no. of people in the subpopulation)

would be considered a random variable. At an extreme, the very appearance of a subpopulation is a random event, and the appropriate SE takes this into account. However, it is likely that Richard's student is interested in the subpopulation as a way of studying a question unrelated to the original target population (see below). In theoretical terms, she may want to study associations, conditional on membership in the subpopulation.

To answer your question about weights:

1. If the purpose of a study is analytic (hypothesis testing, studying relations between variables), then Richard's student may not really be interested in the original target population. As an example, she might never report the weighted counts; she would report the sample counts for crucial variables. The only weights that I would suggest, if any, are those which correct for non-response and unequal probability of selection.

2. It may be better to consider the study as an 'experimental design', where population numbers of the experimental groups are not relevant. In Survey Errors and Survey Costs (Wiley), R. Groves gives the example of a study of noise in the vicinity of an airport. A study is to be done dividing the area around the airport into 'strata', which are zones at equal distance from the flight path or airport. An equal sample size is taken from each zone, and the goal is to study the relation of noise to distance. Of course, most people in the study area will not live in the closest zones. A weighted analysis would give the closest people their population weight. This would be okay if the main goal was descriptive, that is, to estimate the 'average' noise experienced by residents around the airport. However, if you consider this an experimental design, then you want equal numbers at each dose, or, in fact, more at the extremes. Thus you would not apply the population weights.
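The airport example can be made concrete with a small sketch. Everything in it is invented for illustration: the zone distances, the population counts, the noise values, and the function names are hypothetical and are not taken from Groves. The noise values are constructed to be exactly linear in distance, so the weighted and unweighted slopes coincide and only the descriptive mean changes when the population weights are applied. With real data and an imperfect model, the two slopes would generally differ.

```python
def weighted_mean(values, weights):
    """Weighted mean of values."""
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def wls_slope(x, y, weights):
    """Slope of the weighted least-squares line of y on x."""
    xbar = weighted_mean(x, weights)
    ybar = weighted_mean(y, weights)
    sxy = sum(w * (xi - xbar) * (yi - ybar)
              for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - xbar) ** 2 for w, xi in zip(weights, x))
    return sxy / sxx


# Hypothetical design: four distance zones, equal sample size per zone,
# very unequal numbers of residents per zone.
zones = [1, 2, 3, 4]                  # distance from the airport (arbitrary units)
population = [100, 400, 1500, 8000]   # residents per zone (invented)
n_per_zone = 2

x, y, w_pop = [], [], []
for dist, pop in zip(zones, population):
    for _ in range(n_per_zone):
        x.append(dist)
        y.append(80 - 5 * dist)          # invented, exactly linear noise level
        w_pop.append(pop / n_per_zone)   # population weight of a sampled person

w_equal = [1.0] * len(x)  # the 'experimental design' view: no weights

print("mean noise, unweighted:", weighted_mean(y, w_equal))
print("mean noise, weighted:  ", weighted_mean(y, w_pop))
print("slope, unweighted:     ", wls_slope(x, y, w_equal))
print("slope, weighted:       ", wls_slope(x, y, w_pop))
```

The weighted mean answers the descriptive question (average noise experienced by residents), while the slope addresses the analytic question (relation of noise to distance), which is the distinction drawn in the two numbered points.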
You may think this is an extreme case, but I have seen just this analysis in a published study of the association of gestational age to birth weight. Low birth weight infants were oversampled (they are only 5-10% of the population). Yet the analysts did the weighted analysis, which meant that the association in the vicinity of low birth weights was badly determined unless the model was correct.

This is an ongoing debate among survey statisticians, so you will get different points of view.

On Nov 21, 2007, at 3:08 PM, John Singhammer wrote:

To Steven Samuels

Forgive me for interfering in your conversation with Mr. Richard Williams. However, I'm dealing with a dataset consisting of 10 subsamples with information collected over a period of 7 years. I was just wondering why you suggest ignoring the study weights, especially if they were post-stratified...?

Regards,
--
John Singhammer, Dr.phil, Mphil
Dept. of Public Health
Olof Palmes Allè 17
DK8200 Aarhus
Tel: +45 8728 4715
Mobile phone: +45 2530 5768
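The ratio-estimate point in the quoted reply (a subpopulation mean divides one estimated total by another, so its denominator is random) can be sketched as follows. This is a simplified illustration and not a description of what Stata's survey commands do internally: it assumes a single stratum in which each sampled person is its own primary sampling unit, drawn with replacement, and the function name, variable names, and data are hypothetical.

```python
import math


def subpop_mean(x, weights, in_subpop):
    """Weighted subpopulation mean as a ratio estimate, with a linearized SE.

    x         - outcome for every sampled person
    weights   - sampling weight for every sampled person
    in_subpop - 1 if the person belongs to the subpopulation, else 0
    """
    n = len(x)
    # Numerator and denominator are both estimated totals, hence both random.
    num = sum(w * d * xi for xi, w, d in zip(x, weights, in_subpop))
    den = sum(w * d for w, d in zip(weights, in_subpop))
    ratio = num / den

    # Linearized residual for every sampled person. People outside the
    # subpopulation contribute zero but still count toward n.
    z = [w * d * (xi - ratio) / den
         for xi, w, d in zip(x, weights, in_subpop)]

    # The residuals sum to zero by construction, so the with-replacement
    # variance estimate reduces to a scaled sum of squares.
    variance = n / (n - 1) * sum(zi ** 2 for zi in z)
    return ratio, math.sqrt(variance)


# Invented data: five sampled people, three of them in the subpopulation.
x = [2.0, 4.0, 6.0, 8.0, 10.0]
weights = [1.0, 1.0, 2.0, 2.0, 1.0]
in_subpop = [1, 0, 1, 1, 0]

estimate, se = subpop_mean(x, weights, in_subpop)
print("subpopulation mean:", estimate)
print("linearized SE:     ", se)
```

Note that the non-members are kept in the calculation with a zero contribution rather than being dropped. That is what lets the standard error reflect the fact that the number of subpopulation members in the sample was itself a chance outcome.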

Steven Samuels
sjhsamuels@earthlink.net
18 Cantine's Island
Saugerties, NY 12477
Phone: 845-246-0774
EFax: 208-498-7441

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**: **Re: st: Svy subsamples** *From:* "Austin Nichols" <austinnichols@gmail.com>

**References**: **Re: st: Svy subsamples** *From:* Statalist <morofe210-statalist@yahoo.com>
