Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Richard Goldstein <richgold@ix.netcom.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: sampling query
Date: Thu, 27 Jan 2011 17:00:55 -0500

Steve,

thank you (and, yes, I have a copy of Kish)

Rich

On 1/27/11 4:54 PM, Steven Samuels wrote:
>
> Rich,
>
> You must compute these probabilities for every member of the combined
> sample, not just those selected in 2+ cohorts. If possible, your reading
> should include Section 11.2 "Duplicate Listings; Overlapping Frames" of
> Leslie Kish, Survey Sampling, Wiley, 1965.
>
> Steve
>
>
> Rich,
>
> Look up "multiple frames". That's a more common term for samples in
> which the ultimate unit can be reached through different trajectories
> (say landline phone sample, cell phone sample, and area/personal visit
> sample). The probabilities should be combined as
>
> 1 - Prob[ in the sample ] = product over k of (1-Prob[ reach the unit
> through the k-th frame ] )
>
> which for small probabilities leads to the sum of the selection
> probabilities. You are totally right that the probability should go up
> rather than down.
>
> On Thu, Jan 27, 2011 at 10:37 AM, Richard Goldstein
> <richgold@ix.netcom.com> wrote:
>> all,
>>
>> I have received a report in which the report writer was stuck with the
>> following design (already implemented before his involvement): a number
>> of "cohorts" were set up (22 of them in fact) and the definitions of
>> these cohorts were not mutually exclusive (i.e., there was some overlap
>> in membership so that a given observation could appear in more than 1
>> cohort); to calculate the probability weights, the report writer first
>> calculated the probability of inclusion for each cohort (simply as n/N
>> where n is sample size from cohort and N is population size of cohort).
>>
>> For observations in more than one cohort, who were actually selected, he
>> then multiplied the inclusion probabilities of each cohort that
>> observation was in. Since each inclusion probability is less than 1,
>> the combined inclusion probability is smaller than the individual
>> inclusion probabilities for the individual cohorts. And then, of course,
>> the weights are greater for these people (since the weight is just the
>> inverse of the inclusion probability).
>>
>> However, since these observations are in more than one cohort, shouldn't
>> the combined probability be greater for them (rather than smaller)?
>>
>> How should the combined inclusion probability be calculated?
>>
>> Or am I just wrong and the writer of the report is correct?
>>
>> Any references on dealing with overlapping "cohorts" would also be
>> greatly appreciated.
>>
>> Rich

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
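
As an illustration of the formula quoted in the thread, here is a minimal Python sketch of how the combined inclusion probability and the resulting weight might be computed for one unit. It assumes that selection is independent across cohorts, which is what the product form implies. The function name `combined_inclusion_probability` and the example probabilities (0.10 and 0.20) are hypothetical and are not taken from the report under discussion.

```python
from math import prod


def combined_inclusion_probability(cohort_probs):
    """Probability that a unit enters the combined sample.

    cohort_probs holds the selection probability (n/N) of every cohort
    the unit belongs to, including cohorts through which it was not
    actually drawn. Assumes independent selection across cohorts.
    """
    if not all(0.0 <= p <= 1.0 for p in cohort_probs):
        raise ValueError("each probability must lie in [0, 1]")
    # 1 - Prob[in the sample] = product over k of (1 - p_k)
    return 1.0 - prod(1.0 - p for p in cohort_probs)


# Hypothetical unit that belongs to two overlapping cohorts.
probs = [0.10, 0.20]

combined = combined_inclusion_probability(probs)  # about 0.28
naive_product = prod(probs)                       # about 0.02, the report's approach
approx_sum = sum(probs)                           # about 0.30, small-probability approximation

weight = 1.0 / combined                           # probability weight, about 3.57

print(f"combined = {combined:.4f}, product = {naive_product:.4f}, "
      f"sum = {approx_sum:.4f}, weight = {weight:.4f}")
```

With these made-up numbers the combined probability is larger than either cohort probability, while the simple product is far smaller, so multiplying the probabilities would inflate the weight. For a unit that belongs to only one cohort, the function returns that cohort's own probability, so the same calculation can be applied to every member of the combined sample.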

**References**:

- **st: sampling query** *From:* Richard Goldstein <richgold@ix.netcom.com>
- **Re: st: sampling query** *From:* Stas Kolenikov <skolenik@gmail.com>
- **Re: st: sampling query** *From:* Steven Samuels <sjsamuels@gmail.com>
