

From: Steven Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: svy subpop option and e(sample)
Date: Wed, 25 May 2011 13:58:08 -0400

Richard-

For a large enough subpopulation, the correct standard error for the ratio is indistinguishable from the standard error that assumes that the sample size was fixed (Lohr, 2009, p. 135, shows the formula for a SRS). Below is an example. (If you are using fpcs, then the sampling fractions for sub-population and full population must also be close.)

So extract without guilt.

Steve
sjsamuels@gmail.com

Ref: Lohr, Sharon L. 2009. Sampling: Design and Analysis. 2nd ed. Boston, MA: Cengage Brooks/Cole.

*******CODE BEGINS*****************
sysuse auto, clear
set seed 31497
gen u = uniform()
sort u
expand 7
* assign observations to 50 artificial PSUs
gen psu = mod(_n,50)
replace mpg = mpg + 5*uniform()
svyset psu [pweight=turn]
* -if- treats the subpopulation sample size as fixed
svy: mean mpg if foreign==1
* subpop() treats the subpopulation sample size as random
svy, subpop(foreign): mean mpg
*****CODE ENDS********************

On May 25, 2011, at 11:10 AM, Richard Williams wrote:

At 06:20 PM 5/24/2011, Steven Samuels wrote:
> Just to elaborate: with sub-populations, the ratio estimator of a mean with every sample member in numerator and denominator is necessary because the sample size of the subpopulation is random, not fixed. This extends to the regression estimators, as they are functions of means. If you had used an -if- qualifier to restrict the analysis to black==1, e(sample) would work as you expect; the estimates would be the same; but the standard errors would be different.
>
> Steve
> sjsamuels@gmail.com

As a sidelight, one of the things that has always bothered me about subpop is that you are apparently never supposed to create an extract from your data, e.g. you could have 100 million cases and only be interested in a subpopulation of 10,000, but you are nonetheless supposed to keep all 100 million cases in your data set so the standard errors are right. I always wonder how horrible it would be if you just made the extract or used -if- instead of subpop. If, say, the standard errors might be off by .01%, I suspect I could live with that.
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: Richard.A.Williams.5@ND.Edu
WWW: http://www.nd.edu/~rwilliam

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
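Editorial note on the claim that the two standard errors agree for a large subpopulation: a sketch of the usual linearization result for a subpopulation mean under simple random sampling without replacement may help. The notation here is an assumption, not taken from the thread: n and N are the full sample and population sizes, n_d and N_d are the sample and population counts in the subpopulation, and s_d^2 is the sample variance of y within the subpopulation.

```latex
% Ratio (subpop-style) standard error of the subpopulation mean under SRS,
% treating n_d as random:
\[
\mathrm{SE}_{\text{subpop}}(\bar{y}_d)
  = \sqrt{\left(1-\frac{n}{N}\right)\frac{n}{n-1}\,\frac{n_d-1}{n_d}\,\frac{s_d^2}{n_d}}
\]
% Standard error that treats the extract as an SRS of fixed size n_d:
\[
\mathrm{SE}_{\text{fixed}}(\bar{y}_d)
  = \sqrt{\left(1-\frac{n_d}{N_d}\right)\frac{s_d^2}{n_d}}
\]
% For large n_d, n/(n-1) and (n_d-1)/n_d are both close to 1, so the two
% agree whenever the sampling fractions n/N and n_d/N_d are close.
```

Under these assumptions the two expressions differ only by factors that tend to 1 as n_d grows and by the finite population correction, which is consistent with the remark above that the sampling fractions for the subpopulation and the full population must also be close when fpcs are used. The example in the message uses a clustered, weighted design, so this SRS sketch is only indicative there.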

**Follow-Ups**:
- Re: st: svy subpop option and e(sample), from Hitesh Chandwani <hchandwani.stata@gmail.com>

**References**:
- st: svy subpop option and e(sample), from Richard Williams <richardwilliams.ndu@gmail.com>
- Re: st: svy subpop option and e(sample), from Steven Samuels <sjsamuels@gmail.com>
- Re: st: svy subpop option and e(sample), from Richard Williams <richardwilliams.ndu@gmail.com>
