Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

Re: st: svy subpop option and e(sample)

From   Steven Samuels
Subject   Re: st: svy subpop option and e(sample)
Date   Wed, 25 May 2011 13:58:08 -0400

For a large enough subpopulation, the correct standard error for the ratio estimator is practically indistinguishable from the standard error that treats the subpopulation sample size as fixed (Lohr, 2009, p. 135, shows the formula for an SRS). Below is an example. (If you are using fpcs, i.e., finite population corrections, then the sampling fractions for the subpopulation and the full population must also be close.) So extract without guilt.


Ref: Lohr, Sharon L. 2009. Sampling: Design and Analysis. 2nd ed. Boston, MA: Cengage Brooks/Cole.
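*******CODE BEGINS*****************
* Load the example data and fix the seed for reproducibility
sysuse auto, clear
set seed 31497
* Put the observations in random order
gen u= uniform()
sort u
* Replicate each observation 7 times (74 -> 518 observations)
expand 7
* Assign the observations to 50 artificial PSUs
gen psu = mod(_n,50)
* Add noise so that the replicated observations differ in mpg
replace mpg = mpg + 5*uniform()
* Declare the design, using turn as an artificial sampling weight
svyset psu [pweight=turn]
* -if- qualifier: subpopulation sample size treated as fixed
svy: mean mpg if foreign==1
* subpop(): subpopulation sample size treated as random
svy, subpop(foreign): mean mpg
*****CODE ENDS********************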
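
For readers who want the algebra behind the comparison, here is a rough sketch for the SRS case only. It is derived from the usual linearization variance of a ratio, not quoted from Lohr. The notation is an assumption of this note: n is the full sample size, N the population size, x_i an indicator that sample member i is in the subpopulation, n_d the sum of the x_i, N_d the subpopulation size in the population, and s_{yd}^2 the sample variance of y among the n_d subpopulation members.

```latex
% Subpopulation mean written as a ratio over the full sample S
\bar{y}_d = \frac{\sum_{i \in S} x_i y_i}{\sum_{i \in S} x_i}

% Linearization standard error under SRS (n_d treated as random)
\widehat{SE}_{\text{ratio}}(\bar{y}_d)
  = \sqrt{\left(1 - \frac{n}{N}\right)
          \frac{n}{n-1}\,\frac{n_d - 1}{n_d}\,
          \frac{s_{yd}^2}{n_d}}

% Standard error from an extract (n_d treated as fixed)
\widehat{SE}_{\text{fixed}}(\bar{y}_d)
  = \sqrt{\left(1 - \frac{n_d}{N_d}\right)\frac{s_{yd}^2}{n_d}}
```

When n_d is large, the factors n/(n-1) and (n_d-1)/n_d are both close to 1, so the two expressions differ only through the finite population corrections, which agree when n/N and n_d/N_d are close. With complex designs such as the clustered example above, the same reasoning is only a heuristic.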

On May 25, 2011, at 11:10 AM, Richard Williams wrote:

At 06:20 PM 5/24/2011, Steven Samuels wrote:
> Just to elaborate: with sub-populations, the ratio estimator of a mean with every sample member in numerator and denominator is necessary because the sample size of the subpopulation is random, not fixed. This extends to the regression estimators, as they are functions of means.  If you had used an -if- qualifier to restrict the analysis to black==1, e(sample) would work as you expect; the estimates would be the same; but the standard errors would be different.
> Steve

As a sidelight, one of the things that has always bothered me about subpop is that you are apparently never supposed to create an extract from your data, e.g. you could have 100 million cases and only be interested in a subpopulation of 10,000, but you are nonetheless supposed to keep all 100 million cases in your data set so the standard errors are right. I always wonder how horrible it would be if you just made the extract or used -if- instead of subpop. If, say, the standard errors might be off by .01%, I suspect I could live with that.

Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
