Re: st: sampling query

From	Richard Goldstein <[email protected]>
To	[email protected]
Subject	Re: st: sampling query
Date	Thu, 27 Jan 2011 17:00:55 -0500

Steve,

thank you (and, yes, I have a copy of Kish)

Rich

On 1/27/11 4:54 PM, Steven Samuels wrote:
> 
> Rich,
> 
> You must compute these probabilities for every member of the combined
> sample, not just those selected in 2+ cohorts. If possible, your reading
> should include Section 11.2 "Duplicate Listings; Overlapping Frames" of
> Leslie Kish, Survey Sampling, Wiley, 1965.
> 
> Steve
> 
> 
> Rich,
> 
> Look up "multiple frames". That's a more common term for samples in
> which the ultimate unit can be reached through different trajectories
> (say landline phone sample, cell phone sample, and area/personal visit
> sample). The probabilities should be combined as
> 
> 1 - Prob[ in the sample ] = product over k of (1-Prob[ reach the unit
> through the k-th frame ] )
> 
> which, for small probabilities, is approximately the sum of the
> selection probabilities.
> You are totally right that the probability should go up rather than
> down.
> 
> On Thu, Jan 27, 2011 at 10:37 AM, Richard Goldstein
> <[email protected]> wrote:
>> all,
>>
>> I have received a report in which the report writer was stuck with the
>> following design (already implemented before his involvement): a number
>> of "cohorts" were set up (22 of them in fact) and the definitions of
>> these cohorts were not mutually exclusive (i.e., there was some overlap
>> in membership so that a given observation could appear in more than 1
>> cohort); to calculate the probability weights, the report writer first
>> calculated the probability of inclusion for each cohort (simply as n/N
>> where n is sample size from cohort and N is population size of cohort).
>>
>> For selected observations that were in more than one cohort, he
>> then multiplied together the inclusion probabilities of each cohort the
>> observation was in. Since each inclusion probability is less than 1,
>> the combined inclusion probability is smaller than the inclusion
>> probability for any individual cohort. And then, of course,
>> the weights are greater for these people (since the weight is just the
>> inverse of the inclusion probability).
>>
>> However, since these observations are in more than one cohort, shouldn't
>> the combined probability be greater for them (rather than smaller)?
>>
>> How should the combined inclusion probability be calculated?
>>
>> Or am I just wrong and the writer of the report is correct?
>>
>> Any references on dealing with overlapping "cohorts" would also be
>> greatly appreciated.
>>
>> Rich
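
A small numerical sketch of the combination rule quoted above (1 - Prob[in the sample] = product over k of (1 - Prob[k-th frame])) may help show why the combined probability goes up. It assumes selection is independent across cohorts, which is what the product form implies; the two n/N values and all names below are made up for illustration and do not come from the report. Python is used only for the arithmetic.

```python
from math import prod

def combined_inclusion_prob(cohort_probs):
    """Probability of being selected through at least one cohort.

    Assumes selections in different cohorts are independent, so that
    1 - P(in sample) = product over k of (1 - p_k).
    """
    return 1.0 - prod(1.0 - p for p in cohort_probs)

# Hypothetical unit that belongs to two overlapping cohorts;
# each value is n/N for one cohort (illustrative numbers only).
probs = [0.10, 0.05]

p_union = combined_inclusion_prob(probs)  # 1 - 0.90 * 0.95
p_product = prod(probs)                   # what the report computed
p_sum = sum(probs)                        # small-probability approximation

weight = 1.0 / p_union                    # probability weight for this unit

print(f"at least one cohort: {p_union:.4f}")
print(f"product of p_k:      {p_product:.4f}")
print(f"sum of p_k:          {p_sum:.4f}")
print(f"weight:              {weight:.2f}")
```

Under the independence assumption, the plain product of the cohort probabilities is the chance of being selected in every cohort the unit belongs to, which is why it comes out smaller than any single n/N and inflates the weight. The "at least one cohort" probability is larger than each n/N and close to their sum when the probabilities are small.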

References:
- st: sampling query
  - From: Richard Goldstein <[email protected]>
- Re: st: sampling query
  - From: Stas Kolenikov <[email protected]>
- Re: st: sampling query
  - From: Steven Samuels <[email protected]>
