Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Stas Kolenikov <skolenik@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: sampling query
Date: Thu, 27 Jan 2011 13:37:51 -0600

Rich,

Look up "multiple frames". That's a more common term for samples in which the ultimate unit can be reached through different trajectories (say landline phone sample, cell phone sample, and area/personal visit sample). The probabilities should be combined as

1 - Prob[ in the sample ] = product over k of (1 - Prob[ reach the unit through the k-th frame ])

which for small probabilities leads to sum of selection probabilities. You are totally right that the probability should go up rather than down.

On Thu, Jan 27, 2011 at 10:37 AM, Richard Goldstein <richgold@ix.netcom.com> wrote:
> all,
>
> I have received a report in which the report writer was stuck with the
> following design (already implemented before his involvement): a number
> of "cohorts" were set up (22 of them in fact) and the definitions of
> these cohorts were not mutually exclusive (i.e., there was some overlap
> in membership so that a given observation could appear in more than 1
> cohort); to calculate the probability weights, the report writer first
> calculated the probability of inclusion for each cohort (simply as n/N
> where n is sample size from cohort and N is population size of cohort).
>
> For observations in more than one cohort, who were actually selected, he
> then multiplied the inclusion probabilities of each cohort that
> observations was in. Since each inclusion probability is less than 1,
> the combined inclusion probability is smaller than the individual
> inclusion probabilities for the individual cohort. And then, of course,
> the weights are greater for these people (since the weight is just the
> inverse of the inclusion probability).
>
> However, since these observations are in more than one cohort, shouldn't
> the combined probability be greater for them (rather than smaller)?
>
> How should the combined inclusion probability be calculated?
>
> Or am I just wrong and the writer of the report is correct?
>
> Any references on dealing with overlapping "cohorts" would also be
> greatly appreciated.
>
> Rich
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
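The combination rule in the reply above can be illustrated with a small numeric sketch. This is only an illustration, not anything posted in the thread: it assumes that selection is independent across cohorts (the product formula relies on this), the function name `combined_inclusion_prob` is hypothetical, and the two cohort probabilities are made-up values.

```python
from math import prod


def combined_inclusion_prob(cohort_probs):
    """Probability of being selected through at least one cohort.

    Assumes selection is independent across cohorts, so that
    1 - P(in sample) = product over k of (1 - p_k).
    """
    for p in cohort_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"inclusion probability out of range: {p}")
    return 1.0 - prod(1.0 - p for p in cohort_probs)


# Hypothetical unit that belongs to two overlapping cohorts (made-up n/N values).
probs = [0.10, 0.05]

pi_combined = combined_inclusion_prob(probs)  # rule from the reply
pi_product = prod(probs)                      # what the report writer computed
pi_sum = sum(probs)                           # small-probability approximation

weight = 1.0 / pi_combined                    # probability weight for this unit

print(f"combined: {pi_combined:.4f}")
print(f"product:  {pi_product:.4f}")
print(f"sum:      {pi_sum:.4f}")
print(f"weight:   {weight:.2f}")
```

With these made-up values the combined probability exceeds each single-cohort probability, while the plain product falls below both of them, which is the direction of the error Rich describes. The sum is close to the combined value because both probabilities are small.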

**Follow-Ups**:
- **Re: st: sampling query** *From:* Steven Samuels <sjsamuels@gmail.com>
- **Re: st: sampling query** *From:* Richard Goldstein <richgold@ix.netcom.com>

**References**:
- **st: sampling query** *From:* Richard Goldstein <richgold@ix.netcom.com>
