



Re: st: svy subpop option and e(sample)


From   Austin Nichols <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: svy subpop option and e(sample)
Date   Fri, 27 May 2011 12:13:16 -0400

Richard--
I claimed in http://www.stata.com/statalist/archive/2007-11/msg00810.html
that "It is tempting to write a -svysubset- package
to automate this subsetting procedure, but for any given model, the
pattern of missing values might be different, which means the
automatic-subsetting package could offer no savings in general over
keeping all the data in memory."  Maybe a bit strong, but the general point is
that the ad hoc solution is not straightforward to generalize in the presence
of missing data.
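
To make the missing-data point concrete, here is a minimal sketch in Python
(not Stata), using made-up toy data. The variable names x1 and x2, the PSU
labels, and the helper function are all hypothetical. It only shows that the
set of PSUs contributing usable subpopulation observations can differ from
one model to the next, so a single extract built in advance need not suit
every model.

```python
# Hypothetical toy data: (psu, in_subpop, x1, x2); None marks a missing value.
rows = [
    ("A", True, 1.0, 2.0),
    ("A", True, 3.0, None),
    ("B", True, None, 5.0),
    ("C", False, 4.0, 6.0),
]

# Position of each (hypothetical) analysis variable within a row tuple.
COLUMN_INDEX = {"x1": 2, "x2": 3}


def psus_with_usable_subpop(rows, columns):
    """Return PSUs having at least one subpopulation row complete on `columns`."""
    usable = set()
    for row in rows:
        in_subpop = row[1]
        complete = all(row[COLUMN_INDEX[c]] is not None for c in columns)
        if in_subpop and complete:
            usable.add(row[0])
    return usable


model_1 = psus_with_usable_subpop(rows, ["x1"])  # a model that uses x1 only
model_2 = psus_with_usable_subpop(rows, ["x2"])  # a model that uses x2 only
print(sorted(model_1), sorted(model_2))
```

In this toy case PSU B drops out of the first model but not the second,
because its only subpopulation observation is missing x1.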

On Fri, May 27, 2011 at 12:25 PM, Richard Williams
<richardwilliams.ndu@gmail.com> wrote:
> At 10:08 AM 5/27/2011, Steven Samuels wrote:
>>
>> Hitesh
>>
>> After reading Section 5.4 of Korn and Graubard (1999), I return to Stas's
>> advice: you need a good reason not to do the correct analysis. Here lack of
>> memory won't be a reason, for, as you have apparently surmised, you don't
>> need to load the entire original data set. Instead create _one_ dummy
>> observation for each PSU that contains no members of the sub-population. For
>> this observation, set the value of all the analysis variables to zero or to
>> some other convenient value.
>
> Interesting. Would it be fairly straightforward to create an -svyextract-
> command then? It seems like such a command could be quite useful for those
> who would otherwise have to deal with massive data sets. Maybe even add a
> property to the svysettings so the dof would be right when analyzing the
> extract. This might be a good wish list item for Stata 12.
>
>> There is one more thing to do: in the -svyset- statement, use the -dof()-
>> option to set the degrees of freedom to the number of PSUs with members of
>> the sub-population minus the number of strata with observations in the
>> sub-population (Korn & Graubard, 1999, p. 209).
>>
>> Ref: Korn, Edward Lee, and Barry I Graubard. 1999. Analysis of Health
>> Surveys. New York: Wiley.
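
The bookkeeping in Steven's recipe can be sketched as follows. This is an
illustration in Python (not Stata) on a made-up design. The row layout and
the function name are assumptions, and missing values are ignored. It lists
the PSUs that would each need one dummy observation in the extract and
computes the degrees of freedom as (PSUs with subpopulation members) minus
(strata with subpopulation members), the value to pass to -dof()-.

```python
# Hypothetical toy design: each row is (stratum, psu, in_subpop).
rows = [
    (1, "a", True), (1, "a", False), (1, "b", False),
    (2, "c", True), (2, "d", True),
    (3, "e", False), (3, "f", False),
]


def subpop_design(rows):
    """Return (PSUs needing a dummy observation, subpopulation design dof)."""
    all_psus = {(stratum, psu) for stratum, psu, _ in rows}
    sub_psus = {(stratum, psu) for stratum, psu, member in rows if member}
    sub_strata = {stratum for stratum, _ in sub_psus}
    # PSUs with no subpopulation members get one dummy observation each.
    dummy_psus = sorted(all_psus - sub_psus)
    # dof = PSUs containing subpopulation members minus strata containing them.
    dof = len(sub_psus) - len(sub_strata)
    return dummy_psus, dof


dummy_psus, dof = subpop_design(rows)
print(dummy_psus, dof)
```

Here three PSUs in two strata contain subpopulation members, so the sketch
gives 3 - 2 = 1 degree of freedom, and PSUs b, e and f would each get a
dummy observation.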


