Re: st: svy subpop option and e(sample)

From	Austin Nichols <[email protected]>
To	[email protected]
Subject	Re: st: svy subpop option and e(sample)
Date	Fri, 27 May 2011 15:51:34 -0400

Steve--
Just do MI on the complete dataset, wind up with 20 times as much
data, or 100 times as much, then sample from that larger dataset.
OK, maybe that does not help make the dataset small and manageable
after all...

It *is* feasible to use 30-some GB of data, though maybe not on one's
laptop--time to find a better computer!

On Fri, May 27, 2011 at 3:37 PM, Steven Samuels <[email protected]> wrote:
>
> Austin--
>
> Like Richard, I forgot about your post and about the need to pool singleton strata. Your "better" estimation procedure is a complete solution.
>
> In Hitesh's case, keeping all data in memory isn't feasible. For dealing with missing data, what do you think about MI restricted to the subpopulation?
>
> Steve
> [email protected]
>
> On May 27, 2011, at 12:13 PM, Austin Nichols wrote:
>
> Richard--
> I claimed in http://www.stata.com/statalist/archive/2007-11/msg00810.html
> that "It is tempting to write a -svysubset- package
> to automate this subsetting procedure, but for any given model, the
> pattern of missing values might be different, which means the
> automatic-subsetting package could offer no savings in general over
> keeping all the data in memory."  Maybe a bit strong, but the general point is
> that the ad hoc solution is not straightforward to generalize in the presence
> of missing data.
>
> On Fri, May 27, 2011 at 12:25 PM, Richard Williams
> <[email protected]> wrote:
>> At 10:08 AM 5/27/2011, Steven Samuels wrote:
>>>
>>> Hitesh
>>>
>>> After reading Section 5.4 of Korn and Graubard (1999), I return to Stas's
>>> advice: you need a good reason not to do the correct analysis. Here lack of
>>> memory won't be a reason, for, as you have apparently surmised, you don't
>>> need to load the entire original data set. Instead create _one_ dummy
>>> observation for each PSU that contains no members of the sub-population. For
>>> this observation, set the value of all the analysis variables to zero or to
>>> some other convenient value.
>>
>> Interesting. Would it be fairly straightforward to create an -svyextract-
>> command then? It seems like such a command could be quite useful for those
>> who would otherwise have to deal with massive data sets. Maybe even add a
>> property to the svysettings so the dof would be right when analyzing the
>> extract. This might be a good wish list item for Stata 12.
>>
>>> There is one more thing to do: in the -svyset- statement, use the -dof()-
>>> option to set the degrees of freedom to: number of PSUs with members of the
>>> subpopulation minus number of strata with observations in the
>>> sub-population (Korn & Graubard, 1999, p. 209).
>>>
>>> Ref: Korn, Edward Lee, and Barry I Graubard. 1999. Analysis of Health
>>> Surveys. New York: Wiley.
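
A minimal Stata sketch of the extract procedure Steven describes, for illustration only. The file name fulldata.dta and the variable names stratum, psu, pw, insub (a 0/1 subpopulation flag), y, and x are all hypothetical. The sketch assumes a one-stage stratified design, that only these few variables need to be read, and that the analysis variables have no missing values (the complication Austin raises above).

```stata
* Hypothetical names throughout: fulldata.dta, stratum, psu, pw, insub, y, x.
* Read only the variables that are needed, not the whole file.
use stratum psu pw insub y x using fulldata, clear

* Flag PSUs that contain at least one subpopulation member.
bysort stratum psu: egen byte psu_has_sub = max(insub)

* Keep the subpopulation, plus one observation from each PSU that has
* no subpopulation members; that observation serves as the dummy.
bysort stratum psu: gen byte first_in_psu = (_n == 1)
keep if insub == 1 | (psu_has_sub == 0 & first_in_psu == 1)

* Set the analysis variables of the dummy observations to a convenient value.
replace y = 0 if insub == 0
replace x = 0 if insub == 0

* Degrees of freedom: PSUs with subpopulation members minus
* strata with subpopulation members.
egen psu_tag = tag(stratum psu) if insub == 1
egen str_tag = tag(stratum) if insub == 1
count if psu_tag == 1
local npsu = r(N)
count if str_tag == 1
local nstr = r(N)
local df = `npsu' - `nstr'

* Declare the design with the adjusted degrees of freedom, then analyze
* the subpopulation; the dummies only preserve the PSU structure.
svyset psu [pweight = pw], strata(stratum) dof(`df')
svy, subpop(insub): regress y x
```

The regression is only a placeholder for whatever model is being fit. Keeping -subpop(insub)- in the final command is an assumption of this sketch: it keeps the dummy observations out of the estimates while they still count toward the design.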
