Re: st: svy subpop option and e(sample)

From   Richard Williams
Subject   Re: st: svy subpop option and e(sample)
Date   Fri, 27 May 2011 16:00:53 -0500

Thanks Austin. I remembered you said it wasn't straightforward, but I was kind of hoping you were wrong. :) I'm surprised at how difficult this is. If Stata can't work out a nice simple solution, it might be nice to at least include more text somewhere explicitly warning about the problems with extracting.

At 11:13 AM 5/27/2011, Austin Nichols wrote:
I claimed in
that "It is tempting to write a -svysubset- package
to automate this subsetting procedure, but for any given model, the
pattern of missing values might be different, which means the
automatic-subsetting package could offer no savings in general over
keeping all the data in memory."  Maybe a bit strong, but the general point is
that the ad hoc solution is not straightforward to generalize in the presence
of missing data.

On Fri, May 27, 2011 at 12:25 PM, Richard Williams wrote:
> At 10:08 AM 5/27/2011, Steven Samuels wrote:
>> Hitesh
>> After reading Section 5.4 of Korn and Graubard (1999), I return to Stas's
>> advice: you need a good reason not to do the correct analysis. Here lack of
>> memory won't be a reason, for, as you have apparently surmised, you don't
>> need to load the entire original data set. Instead create _one_ dummy
>> observation for each PSU that contains no members of the sub-population. For
>> this observation, set the value of all the analysis variables to zero or to
>> some other convenient value.
> Interesting. Would it be fairly straightforward to create an -svyextract-
> command then? It seems like such a command could be quite useful for those
> who would otherwise have to deal with massive data sets. Maybe even add a
> property to the svysettings so the dof would be right when analyzing the
> extract. This might be a good wish list item for Stata 12.
>> There is one more thing to do: in the -svyset- statement, use the -dof()-
>> option to set the degrees of freedom to: number of PSUs with members of the
>> subpopulation minus number of strata with observations in the
>> sub-population (Korn & Graubard, 1999, p. 209).
>> Ref: Korn, Edward Lee, and Barry I Graubard. 1999. Analysis of Health
>> Surveys. New York: Wiley.
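
To make the two steps in Steven's advice concrete, here is a minimal sketch in plain Python rather than Stata, so it is an illustration of the bookkeeping only and not anyone's actual implementation. The toy records, the field layout (stratum, psu, in_subpop, y), and the function names build_extract and subpop_dof are all hypothetical, and sampling weights are left out for brevity. It keeps the subpopulation rows, adds one placeholder row for each PSU with no subpopulation members, and computes the degrees of freedom as the number of PSUs with subpopulation members minus the number of strata with subpopulation observations.

```python
from collections import defaultdict

# Hypothetical toy records: (stratum, psu, in_subpop, y). Not from the thread.
rows = [
    (1, "a", 1, 3.0),
    (1, "a", 0, 5.0),
    (1, "b", 0, 2.0),
    (1, "b", 0, 4.0),
    (2, "c", 1, 1.0),
    (2, "d", 1, 6.0),
    (3, "e", 0, 7.0),
    (3, "f", 0, 8.0),
]


def build_extract(rows, fill=0.0):
    """Keep subpopulation rows; add one placeholder row per PSU without any."""
    has_member = defaultdict(bool)
    for stratum, psu, insub, _ in rows:
        has_member[(stratum, psu)] |= bool(insub)
    # Subpopulation members are kept unchanged.
    extract = [r for r in rows if r[2]]
    # Each PSU with no members gets exactly one dummy row, with the
    # analysis variable set to a convenient value (zero by default).
    for (stratum, psu), member in sorted(has_member.items()):
        if not member:
            extract.append((stratum, psu, 0, fill))
    return extract


def subpop_dof(rows):
    """PSUs with subpopulation members minus strata with such members."""
    psus = {(s, p) for s, p, insub, _ in rows if insub}
    strata = {s for s, _ in psus}
    return len(psus) - len(strata)


extract = build_extract(rows)
dof = subpop_dof(rows)
print(len(extract), dof)
```

In this toy design the extract still contains all six PSUs across the three strata, so the design structure is preserved, while the degrees-of-freedom count uses only the three PSUs and two strata that contain subpopulation members. That count is the kind of value one would then supply through the -dof()- option mentioned above.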

Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
