
Re: st: sample adjustment by substitution instead of weighting

From	Dirk Enzmann <[email protected]>
To	[email protected]
Subject	Re: st: sample adjustment by substitution instead of weighting
Date	Sat, 25 Apr 2009 14:46:24 +0200

Thank you very much, Steve, for your elaborate answer - it is very helpful indeed!


Dirk

On behalf of Steve I include his answer in reply to
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.0904/date/article-1134.html
here because at the moment he can't send mails to the list:

---------------------------------------------------------------------
Dirk-

I've never heard of this procedure. There is some basis for thinning a sample randomly to meet sampling goals, and substitutions for missing observations are also practiced, but you are not describing either of these.

The process of exclusion and duplication will destroy the ability of the sample to estimate anything but the characteristics that are being matched--but those are already known! For instance, the sample cannot estimate without bias the means of other variates. For the matched characteristics, the sample will not permit estimation of SDs or quantiles. Moreover, no standard errors or confidence intervals can be computed for anything, because the exclusions and duplication have artificially reduced the variability in the sample.
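One part of this point can be illustrated with a small sketch. The data below are made up for illustration, and `naive_se` is a hypothetical helper, not anything from Stata. It shows only that duplicated records add no information, yet a standard error computed as if every record were an independent observation shrinks anyway:

```python
import math
import statistics

def naive_se(values):
    """Standard error of the mean, treating every record as independent."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical outcome values for six sampled respondents.
original = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

# Duplicate every record once: same information, twice as many rows.
duplicated = original + original

se_original = naive_se(original)
se_duplicated = naive_se(duplicated)

# The point estimate is unchanged, but the naive standard error is smaller.
print(statistics.mean(original), statistics.mean(duplicated))
print(se_original, se_duplicated)
```

The smaller standard error for the duplicated data reflects nothing but the copying, which is why intervals computed from such a sample cannot be trusted.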

To better match the sample estimates to known population characteristics, I know of only three procedures: 1) post-stratification; 2) sample raking, which is an extension; and 3) generalized regression estimation (GREG).
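As a rough sketch of the first of these, post-stratification keeps every respondent and attaches a weight so that the weighted stratum shares equal the known population shares. The data, the population shares, and the helper names `poststratify` and `weighted_mean` are assumptions made up for this illustration; in practice one would use survey software that also produces valid standard errors.

```python
from collections import Counter

def poststratify(strata, population_shares):
    """Return one weight per respondent so that the weighted share of
    each stratum equals its (assumed known) population share."""
    n = len(strata)
    counts = Counter(strata)
    return [population_shares[s] * n / counts[s] for s in strata]

def weighted_mean(values, weights):
    """Weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical sample: three women and one man, with some outcome y.
sex = ["f", "f", "f", "m"]
y = [10.0, 12.0, 14.0, 20.0]

# Assumed known population shares for the matching variable.
shares = {"f": 0.5, "m": 0.5}

w = poststratify(sex, shares)
estimate = weighted_mean(y, w)
print(w, estimate)
```

No record is dropped or copied here: the over-represented group is weighted down and the under-represented group is weighted up.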

The exclusions and duplication are naive attempts to re-weight the sample. However, they completely destroy it. So, no, this is not actual practice. The only discussion of something similar that I've read is in Lohr (1999, Sampling: Design and Analysis, Duxbury, p. 463), who gives the reference to Neyman, J. 1934. On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. J. Royal Statistical Society 97: 558-606. Here is the quote from her book:

"Neyman's paper pretty much finished off the idea that results from purposive samples could be generalized to the population. He presented an example of the purposive sample taken by Gini and Galvani in the late 1920's. Gini and Galvani chose 29 districts that gave the averages of all 214 districts in the 1921 Italian census, on a dozen variables. But Neyman showed that all statistics other than the average values of the controls showed a violent contrast between the sample and the whole population."

Of course, Gini and Galvani only excluded; they did not duplicate. So the procedure has long been discredited.


-Steve
---------------------------------------------------------------------
