Re: st: sampling query

From   Stas Kolenikov <[email protected]>
To   [email protected]
Subject   Re: st: sampling query
Date   Thu, 27 Jan 2011 13:37:51 -0600


Look up "multiple frames". That's a more common term for samples in
which the ultimate unit can be reached through different trajectories
(say landline phone sample, cell phone sample, and area/personal visit
sample). The probabilities should be combined as

1 - Prob[ in the sample ]
    = product over k of ( 1 - Prob[ reach the unit through the k-th frame ] )

which for small probabilities is approximately the sum of the selection
probabilities. You are totally right that the probability should go up
rather than down.
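
A minimal sketch of the formula above in Python, assuming selection is
independent across frames (the product form relies on that). The function
name and the two cohort probabilities are made up for illustration and do
not come from the report in question:

```python
from math import prod


def combined_inclusion_prob(frame_probs):
    """Probability that a unit ends up in the sample, given its selection
    probability in each frame (cohort) it belongs to.

    Assumes selections in different frames are independent:
        1 - P(in sample) = product over k of (1 - p_k)
    """
    return 1.0 - prod(1.0 - p for p in frame_probs)


# Hypothetical unit that belongs to two overlapping cohorts,
# each with its own n/N sampling fraction.
probs = [0.02, 0.05]

pi_combined = combined_inclusion_prob(probs)   # multiple-frame formula
pi_sum = sum(probs)                            # small-probability approximation
pi_product = prod(probs)                       # what the report writer did

print("combined probability:", pi_combined)
print("sum approximation:   ", pi_sum)
print("product (incorrect): ", pi_product)
print("weight from combined:", 1.0 / pi_combined)
print("weight from product: ", 1.0 / pi_product)
```

With these made-up numbers the combined probability is about 0.069, close
to the sum 0.07 and larger than either cohort's probability, whereas the
product 0.001 is smaller than both and inflates the weight accordingly.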

On Thu, Jan 27, 2011 at 10:37 AM, Richard Goldstein
<[email protected]> wrote:
> all,
> I have received a report in which the report writer was stuck with the
> following design (already implemented before his involvement): a number
> of "cohorts" were set up (22 of them in fact) and the definitions of
> these cohorts were not mutually exclusive (i.e., there was some overlap
> in membership so that a given observation could appear in more than 1
> cohort); to calculate the probability weights, the report writer first
> calculated the probability of inclusion for each cohort (simply as n/N
> where n is sample size from cohort and N is population size of cohort).
> For observations in more than one cohort, who were actually selected, he
> then multiplied the inclusion probabilities of each cohort that the
> observation was in. Since each inclusion probability is less than 1,
> the combined inclusion probability is smaller than the inclusion
> probability for any individual cohort. And then, of course,
> the weights are greater for these people (since the weight is just the
> inverse of the inclusion probability).
> However, since these observations are in more than one cohort, shouldn't
> the combined probability be greater for them (rather than smaller)?
> How should the combined inclusion probability be calculated?
> Or am I just wrong and the writer of the report is correct?
> Any references on dealing with overlapping "cohorts" would also be
> greatly appreciated.
> Rich

Stas Kolenikov, also found at
Small print: I use this email account for mailing lists only.
