Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Resampling and compare full sample with subsamples


From   Steve Samuels <[email protected]>
To   [email protected]
Subject   Re: st: Resampling and compare full sample with subsamples
Date   Fri, 7 Mar 2014 18:19:59 -0500

Ah, you left out most of the detail; your explanation makes sense. To
answer your original question: you want to compare a part A to a whole C.
But C = A U B, where B is the set of observations in C that are not in A.
Let pA, pB, and pC be the prevalence rates in A, B, and C, and let nA,
nB, and n be the sample sizes of A, B, and C. Then

(*) pC = W pA + (1 - W) pB where W = nA/n.

(**) pC - pA = (1-W)(pB - pA).
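
As a quick numerical check of (*) and (**), here is a minimal Python sketch. The counts are hypothetical and chosen only for illustration; nothing here comes from the survey discussed in this thread.

```python
# Hypothetical counts, invented for illustration only.
n_a, x_a = 200, 50     # subsample A: size and number of positives
n_b, x_b = 800, 160    # remainder B: size and number of positives

n = n_a + n_b
p_a = x_a / n_a
p_b = x_b / n_b
p_c = (x_a + x_b) / n  # prevalence in the whole sample C
w = n_a / n            # W = nA / n

# (*)  pC = W pA + (1 - W) pB
star_rhs = w * p_a + (1 - w) * p_b

# (**) pC - pA = (1 - W)(pB - pA)
dstar_lhs = p_c - p_a
dstar_rhs = (1 - w) * (p_b - p_a)

print(p_c, star_rhs)
print(dstar_lhs, dstar_rhs)
```

Both pairs agree up to floating-point rounding, so any difference between pC and pA is just a rescaled difference between pB and pA.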

A one-sample test comparing A to C is not correct: C is itself a random
sample, and pC and pA are correlated. Because A is a simple random sample
(SRS) of C drawn without replacement, B is also an SRS, and pA and pB are
slightly negatively correlated because of (*).
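
To illustrate why pA and pC cannot be treated as independent, here is a small simulation sketch in Python. It rests on assumptions that are mine, not the thread's: C is drawn as independent 0/1 outcomes with a common prevalence, there is no clustering, and the sizes, prevalence, and seed are arbitrary.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate(reps=2000, n=500, n_a=100, p=0.3, seed=12345):
    """Repeatedly draw a sample C, then an SRS A from C, and correlate pA with pC."""
    rng = random.Random(seed)
    pa_list, pc_list = [], []
    for _ in range(reps):
        c = [1 if rng.random() < p else 0 for _ in range(n)]  # whole sample C
        a = rng.sample(c, n_a)                                # SRS without replacement
        pa_list.append(sum(a) / n_a)
        pc_list.append(sum(c) / n)
    return pearson(pa_list, pc_list)

r = simulate()
print(round(r, 2))
```

Under these assumptions the correlation should come out clearly positive, near sqrt(nA/n), which is why a one-sample test that treats pC as a fixed constant is miscalibrated.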

If pA and pB are different, then pA and pC are different (and
vice-versa). Looking at (**) you can see that the proper test is a
*two-sample* test that compares pA and pB. The standard error is
computed under the null hypothesis, and without a finite population
correction. (Cochran, 1977, problem 2.16, p. 48). I myself think that
confidence intervals are preferable to hypothesis tests here.
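
A minimal Python sketch of that two-sample comparison follows. It assumes simple random sampling of individuals and ignores the clustering of students within schools; the counts are hypothetical. The test statistic uses the pooled standard error under the null with no finite population correction, and the confidence interval uses the usual unpooled standard error with a normal approximation.

```python
import math

def two_sample_prop_test(x_a, n_a, x_b, n_b):
    """Two-sample z test and 95% CI for pB - pA (no finite population correction)."""
    p_a, p_b = x_a / n_a, x_b / n_b
    diff = p_b - p_a

    # Standard error under H0: pA = pB, using the pooled proportion.
    pooled = (x_a + x_b) / (n_a + n_b)
    se_null = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_null
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value

    # Unpooled standard error for the confidence interval.
    se_ci = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    ci = (diff - 1.96 * se_ci, diff + 1.96 * se_ci)
    return z, p_value, ci

# Hypothetical counts: 50 of 200 positive in A, 160 of 800 positive in B.
z, p_value, ci = two_sample_prop_test(50, 200, 160, 800)
print(round(z, 3), round(p_value, 3), [round(v, 3) for v in ci])
```

By (**), an interval for pC - pA can be obtained by multiplying both ends of the interval for pB - pA by (1 - W).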

Reference: Cochran, W. G. (1977). Sampling techniques (3rd ed.). New
York: Wiley.


Steve
[email protected]

> On Mar 7, 2014, at 5:10 AM, Johannes Thrul <[email protected]> wrote:
> 
> Steve,
> Thank you for your suggestions - I will look into them. The logic behind the approach is that there are several waves in said cross-sectional survey and the response rates among schools have been dropping. So I am going to use a wave where the response rate was still good, reduce the sample size randomly and based on certain school characteristics and examine the resulting prevalence rates. This should give me an idea of what losing certain kinds of schools means for the reliability of prevalence figures in other survey waves.
> So what I have come up with is a loop that draws random samples of a certain size from the original sample. I then calculate a mean over these random samples and compare it to the value of the original sample using a one-sample test of proportions.
> Does this sound like a reasonable approach to you?
> Thanks and kind regards, Johannes
> 
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Steve Samuels
Sent: Tuesday, 4 March 2014 21:38
To: [email protected]
Subject: Re: st: Resampling and compare full sample with subsamples

Johannes-

I don't get the logic of your approach. Assuming that prevalence rates of responders and non-responders differ, i.e. show "response bias", a comparison of random samples of responders to all responders will provide *no* information on the degree of bias.

There are accepted reweighting techniques for evaluating and reducing response bias. See the downloadable references below.

If, in fact, you know descriptive statistics for the entire population of schools and for the responding schools, you can make the responding schools more closely resemble not just the sample, but the population.
See Stas Kolenikov's -ipfraking- (-findit-) and John D'Souza's
-calibrate- (SSC).
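
To show the idea behind such reweighting, here is a minimal Python sketch of one-variable post-stratification. This is a simpler technique than the raking or calibration those Stata commands perform, and it is not their implementation. All counts and prevalences are hypothetical, and schools are treated as equally sized units.

```python
# Hypothetical school counts and prevalences, invented for illustration.
population_schools = {"small": 300, "large": 300}  # all schools in the population
responding_schools = {"small": 240, "large": 120}  # schools that responded
prevalence = {"small": 0.20, "large": 0.30}        # prevalence among responders

# Post-stratification weight: population count / responding count per stratum.
weights = {h: population_schools[h] / responding_schools[h]
           for h in population_schools}

# Unweighted estimate: large schools are under-represented among responders.
n_resp = sum(responding_schools.values())
unweighted = sum(responding_schools[h] * prevalence[h] for h in prevalence) / n_resp

# Weighted estimate: each stratum counts in proportion to the population.
weighted_total = sum(weights[h] * responding_schools[h] * prevalence[h]
                     for h in prevalence)
weight_sum = sum(weights[h] * responding_schools[h] for h in prevalence)
weighted = weighted_total / weight_sum

print(round(unweighted, 4), round(weighted, 4))
```

The weighted estimate moves toward the stratum that responded less often. Raking extends the same adjustment to several margins at once.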


References:

Burns, Shelley, Xiaolei Wang, and Alexandra Henning. 2011. NCES Handbook of Survey Methods. NCES 2011-609. National Center for Education Statistics http://eric.ed.gov/?id=ED521154

Carlson, BL, and Williams, S. 2001. A comparison of two methods to adjust weights for non-response: propensity modeling and weighting class adjustments. Proceedings of the Annual Meeting of the American Statistical Association http://www.amstat.org/sections/SRMS/proceedings/y2001/Proceed/00111.pdf

Kreuter, Frauke, Kristen Olson, James Wagner, Ting Yan, Trena M Ezzati-Rice, Carolina Casas-Cordero, Michael Lemay, Andy Peytchev, Robert M Groves, and Trivellore E Raghunathan. 2010. Using proxy measures and other correlates of survey outcomes to adjust for non-response: examples from multiple surveys. Journal of the Royal Statistical Society: Series A (Statistics in Society) 173, no. 2: 389-407. http://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1139&context=sociologyfacpub


Little, RJ, and S Vartivarian. 2003. On weighting the rates in non-response weights. Stat Med 22, no. 9: 1589-1599, available at http://deepblue.lib.umich.edu/bitstream/2027.42/34860/1/1513_ftp.pdf


Wun, L-M, and Ezzati-Rice, T. 2007. Assessment of the impact of health variables on nonresponse adjustment in the medical expenditure panel survey (MEPS). Proc. Surv. Res. Meth. Sect. Am. Statist. Ass 2857-2864.
http://www.amstat.org/sections/SRMS/Proceedings/y2007/Files/JSM2007-000336.pdf

Steve


Steve Samuels
Consultant in Statistics
18 Cantine's Island
Saugerties NY 12477 USA
845-246-0774


> On Mar 3, 2014, at 10:58 AM, Johannes Thrul <[email protected]> wrote:
> 
> Dear list,
> I am working on a large survey dataset (12,000 individuals clustered in 600 schools) and want to examine how non-response/non-participation of schools affects prevalence estimates (e.g., alcohol use). My plan is to reduce the sample of schools randomly and systematically (e.g., only exclude large schools) and compare the resulting estimates from the subsamples with the estimates from the full sample. I thought of an approach like this: Reduce the sample size in 10% increments, draw a number of subsamples at every step and compare the estimates. However, I have two questions about how to best approach this in Stata:
> 1. Drawing subsamples: Should I use a jackknife, bootstrap, or even something entirely different for drawing the subsamples?
> 2. Testing: How should I go about testing the results from the subsamples against the full sample?
> Any help is greatly appreciated!
> Thanks and kind regards, Johannes
> 
> --
> Dr. Johannes Thrul, Dipl.-Psych.
> 
> Wissenschaftlicher Mitarbeiter / Researcher Präventionsforschung / 
> Prevention Research
> 
> IFT Institut für Therapieforschung / Parzivalstr. 25, D-80804 München 
> / www.ift.de phone +49 (0) 89 360804 86 / fax +49 (0) 89 360804 69 / 
> e-mail [email protected]
> 
> IFT Institut für Therapieforschung gem. Gesellschaft mbH / 
> Registergericht München HRB 46395 Geschäftsführung: Prof. Dr. Gerhard 
> Bühringer
> 
> 
*
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


