Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: svy subpop option and e(sample)

| From | To | Subject | Date |
|---|---|---|---|
| Steven Samuels | statalist@hsphsun2.harvard.edu | Re: st: svy subpop option and e(sample) | Wed, 25 May 2011 13:58:08 -0400 |

```Richard-
For a large enough subpopulation, the correct standard error for the ratio is practically indistinguishable from the standard error that assumes that the sample size was fixed (Lohr, 2009, p. 135, shows the formula for an SRS). Below is an example. (If you are using finite population corrections (fpcs), then the sampling fractions for the subpopulation and the full population must also be close.)  So extract without guilt.

Steve
sjsamuels@gmail.com

Ref: Lohr, Sharon L. 2009. Sampling: Design and Analysis. 2nd ed. Boston, MA: Cengage Brooks/Cole.
*******CODE BEGINS*****************
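* Build an artificial survey data set from auto.dta
sysuse auto, clear
set seed 31497
* Put the cars in random order
gen u= uniform()
sort u
* Make 7 copies of each car and assign them to 50 artificial PSUs
expand 7
gen psu = mod(_n,50)
* Add noise to mpg so that the copies are not identical
replace mpg = mpg + 5*uniform()
* turn serves only as an arbitrary sampling weight
svyset psu [pweight=turn]
* -if-: treats the subpopulation sample size as fixed
svy: mean mpg if foreign==1
* subpop(): treats the subpopulation sample size as random
svy, subpop(foreign): mean mpg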
*****CODE ENDS********************
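
* A sketch of where the subpop() standard error comes from, assuming the
* data set built by the code above is still in memory. The variable name
* mpg_f is made up for this illustration. The subpopulation mean is a
* ratio of two estimated totals over the full sample (mpg*foreign in the
* numerator, foreign in the denominator), so -svy: ratio- should give the
* same estimate and standard error as subpop().
*******CODE BEGINS*****************
* Numerator variable: mpg for foreign cars, 0 for all others
gen mpg_f = mpg*foreign
* Ratio of totals over every sample member
svy: ratio mpg_f/foreign
* Compare with the subpopulation mean
svy, subpop(foreign): mean mpg
*****CODE ENDS********************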

On May 25, 2011, at 11:10 AM, Richard Williams wrote:

At 06:20 PM 5/24/2011, Steven Samuels wrote:
> Just to elaborate: with sub-populations, the ratio estimator of a mean with every sample member in numerator and denominator is necessary because the sample size of the subpopulation is random, not fixed. This extends to the regression estimators, as they are functions of means.  If you had used an -if- qualifier to restrict the analysis to black==1, e(sample) would work as you expect; the estimates would be the same; but the standard errors would be different.
>
> Steve
> sjsamuels@gmail.com

As a sidelight, one of the things that has always bothered me about subpop is that you are apparently never supposed to create an extract from your data, e.g. you could have 100 million cases and only be interested in a subpopulation of 10,000, but you are nonetheless supposed to keep all 100 million cases in your data set so the standard errors are right. I always wonder how horrible it would be if you just made the extract or used -if- instead of subpop. If, say, the standard errors might be off by .01%, I suspect I could live with that.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
WWW:    http://www.nd.edu/~rwilliam

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```