Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Steven Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: sampling query
Date: Thu, 27 Jan 2011 16:54:54 -0500

Rich,

Steve

Rich,

Look up "multiple frames". That's a more common term for samples in which the ultimate unit can be reached through different trajectories (say, a landline phone sample, a cell phone sample, and an area/personal visit sample). The probabilities should be combined as

1 - Prob[ in the sample ] = product over k of ( 1 - Prob[ reach the unit through the k-th frame ] )

which for small probabilities is approximately the sum of the selection probabilities. You are totally right that the probability should go up rather than down.

On Thu, Jan 27, 2011 at 10:37 AM, Richard Goldstein <richgold@ix.netcom.com> wrote:

all,

I have received a report in which the report writer was stuck with the following design (already implemented before his involvement): a number of "cohorts" were set up (22 of them in fact) and the definitions of these cohorts were not mutually exclusive (i.e., there was some overlap in membership so that a given observation could appear in more than 1 cohort); to calculate the probability weights, the report writer first calculated the probability of inclusion for each cohort (simply as n/N where n is sample size from cohort and N is population size of cohort). For observations in more than one cohort, who were actually selected, he then multiplied the inclusion probabilities of each cohort that observation was in. Since each inclusion probability is less than 1, the combined inclusion probability is smaller than the individual inclusion probabilities for the individual cohort. And then, of course, the weights are greater for these people (since the weight is just the inverse of the inclusion probability).

However, since these observations are in more than one cohort, shouldn't the combined probability be greater for them (rather than smaller)? How should the combined inclusion probability be calculated? Or am I just wrong and the writer of the report is correct?

Any references on dealing with overlapping "cohorts" would also be greatly appreciated.

Rich

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
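As a rough illustration of the combination rule quoted in this thread (1 - Prob[in the sample] = product over k of (1 - p_k)), here is a minimal Python sketch. It assumes selection is independent across cohorts, which is what the product form implies. The function name `combined_inclusion_prob` and the two example n/N values are hypothetical, not taken from the report under discussion.

```python
from math import prod


def combined_inclusion_prob(frame_probs):
    """Probability that a unit is selected through at least one frame/cohort.

    Assumes (hypothetically) that selection is independent across frames,
    so P(not selected at all) is the product of the (1 - p_k).
    """
    return 1.0 - prod(1.0 - p for p in frame_probs)


# Hypothetical n/N inclusion probabilities for a unit in two overlapping cohorts.
probs = [0.02, 0.05]

p_union = combined_inclusion_prob(probs)  # rule quoted in the thread
p_product = prod(probs)                   # what the report writer computed
p_sum = sum(probs)                        # small-probability approximation

weight = 1.0 / p_union                    # probability weight = inverse inclusion probability

print(f"combined (union) probability: {p_union:.4f}")
print(f"product of probabilities:     {p_product:.4f}")
print(f"sum of probabilities:         {p_sum:.4f}")
print(f"weight from union:            {weight:.2f}")
```

With these made-up inputs, the union probability is larger than either cohort's own probability and close to their sum, while the product is smaller than both. That is the direction of the discrepancy raised in the original question.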

**Follow-Ups**:
- **Re: st: sampling query** *From:* Richard Goldstein <richgold@ix.netcom.com>

**References**:
- **st: sampling query** *From:* Richard Goldstein <richgold@ix.netcom.com>
- **Re: st: sampling query** *From:* Stas Kolenikov <skolenik@gmail.com>
