


Re: st: svy subpop option and e(sample)

From   Steven Samuels
Subject   Re: st: svy subpop option and e(sample)
Date   Fri, 27 May 2011 15:37:55 -0400


Like Richard, I forgot about your post and about the need to pool singleton strata. Your "better" estimation procedure is a complete solution. 

In Hitesh's case, keeping all data in memory isn't feasible. For dealing with missing data, what do you think about MI restricted to the subpopulation? 


On May 27, 2011, at 12:13 PM, Austin Nichols wrote:

I claimed in an earlier post
that "It is tempting to write a -svysubset- package
to automate this subsetting procedure, but for any given model, the
pattern of missing values might be different, which means the
automatic-subsetting package could offer no savings in general over
keeping all the data in memory."  Maybe a bit strong, but the general point is
that the ad hoc solution is not straightforward to generalize in the presence
of missing data.

On Fri, May 27, 2011 at 12:25 PM, Richard Williams wrote:
> At 10:08 AM 5/27/2011, Steven Samuels wrote:
>> Hitesh
>> After reading Section 5.4 of Korn and Graubard (1999), I return to Stas's
>> advice: you need a good reason not to do the correct analysis. Here lack of
>> memory won't be a reason, for, as you have apparently surmised, you don't
>> need to load the entire original data set. Instead create _one_ dummy
>> observation for each PSU that contains no members of the sub-population. For
>> this observation, set the value of all the analysis variables to zero or to
>> some other convenient value.
> Interesting. Would it be fairly straightforward to create an -svyextract-
> command then? It seems like such a command could be quite useful for those
> who would otherwise have to deal with massive data sets. Maybe even add a
> property to the svysettings so the dof would be right when analyzing the
> extract. This might be a good wish list item for Stata 12.
>> There is one more thing to do: in the -svyset- statement, use the -dof()-
>> option to set the degrees of freedom to: number of PSUs with members of the
>> subpopulation minus number of strata with observations in the
>> sub-population (Korn & Graubard, 1999, p. 209).
>> Ref: Korn, Edward Lee, and Barry I Graubard. 1999. Analysis of Health
>> Surveys. New York: Wiley.
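
The two steps quoted above (one dummy observation per PSU with no subpopulation members, then the -dof()- value) can be sketched roughly as follows. This is only an illustration in plain Python, not Stata code and not anyone's posted implementation. The field names (stratum, psu, weight, insub), the function names, and the placeholder weight of 1.0 on the dummy rows are all assumptions made for the example. It also assumes you already have the list of every (stratum, PSU) pair in the full design.

```python
def add_dummy_psus(all_psus, subpop_rows, analysis_vars, fill=0.0):
    """Return the subpopulation rows plus one placeholder row for each
    PSU in the full design that has no subpopulation member.

    all_psus      -- iterable of (stratum, psu) pairs for the whole sample
    subpop_rows   -- list of dicts, each with 'stratum' and 'psu' keys
    analysis_vars -- names of analysis variables to fill on dummy rows
    """
    covered = {(r["stratum"], r["psu"]) for r in subpop_rows}
    extract = [dict(r) for r in subpop_rows]
    for stratum, psu in sorted(set(all_psus) - covered):
        # insub = 0 marks the row as outside the subpopulation.
        # weight = 1.0 is a placeholder chosen for this sketch (assumption).
        row = {"stratum": stratum, "psu": psu, "weight": 1.0, "insub": 0}
        row.update({v: fill for v in analysis_vars})
        extract.append(row)
    return extract


def subpop_dof(subpop_rows):
    """PSUs containing subpopulation members minus strata containing
    subpopulation members (the rule quoted from Korn & Graubard, p. 209)."""
    psus = {(r["stratum"], r["psu"]) for r in subpop_rows}
    strata = {stratum for stratum, _ in psus}
    return len(psus) - len(strata)


# Made-up design: three strata with two PSUs each.
all_psus = [("A", 1), ("A", 2), ("B", 1), ("B", 2), ("C", 1), ("C", 2)]
subpop_rows = [
    {"stratum": "A", "psu": 1, "weight": 2.0, "insub": 1, "y": 3.0},
    {"stratum": "A", "psu": 1, "weight": 2.0, "insub": 1, "y": 5.0},
    {"stratum": "A", "psu": 2, "weight": 1.5, "insub": 1, "y": 4.0},
    {"stratum": "B", "psu": 1, "weight": 3.0, "insub": 1, "y": 1.0},
]

extract = add_dummy_psus(all_psus, subpop_rows, analysis_vars=["y"])
dof = subpop_dof(subpop_rows)
```

In this made-up design the extract gains three dummy rows, for PSUs (B, 2), (C, 1), and (C, 2), and the degrees of freedom come to 3 PSUs minus 2 strata = 1. The extract would then be analyzed with the subpopulation indicator, and that value supplied to -dof()-.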
