

From: Steven Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: svy subpop option and e(sample)
Date: Thu, 26 May 2011 22:54:19 -0400

Hitesh-

I should also have asked the nature of the subpopulation--whether it was defined by characteristics of the PSUs (e.g. region), so that a PSU was either in or out, or by characteristics of observations within PSUs, so that a PSU could contain people in and people out of the subpopulation. If the former, is the subpopulation one of the original sampling strata?

Steve

Hitesh,

The relevant number would be the number of PSUs. If that is 300,000, I would think that it's much more than enough. If you don't mind my asking, what kind of sample had 75 million observations? I usually encounter numbers like that only in census data.

Steve
sjsamuels@gmail.com

Steve,

You said in an earlier message:

For a large enough subpopulation, the correct standard error for the ratio is indistinguishable from the standard error that assumes that the sample size was fixed (Lohr, 2009, p. 135, shows the formula for a SRS).

How large is large enough? I am facing a similar problem. I extracted my subpopulation of interest and have 300,000 observations. My original data had 75 million observations with 61 variables. I cannot use the entire data due to insufficient RAM on my computer (I will need about 30-odd GB of RAM to analyze the data as a whole). I had to ask someone with access to such a powerful machine to extract the data for me.

If the standard errors for data this large are not going to be very biased, I can report the variance estimation issue as a limitation of the analysis. If the data are not large enough, then I will need to compute dummy variables for all PSUs not represented in the extracted data.

I would appreciate any help on the matter.

Regards,
--
Hitesh S. Chandwani
University of Texas at Austin

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
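
To give a feel for "how large is large enough" in the simple random sample case that the quoted statement refers to, here is a minimal Python sketch. It is not from the thread and is not Stata's `svy` machinery. It assumes an SRS of size `n` from a population of size `N`, treats the subpopulation mean as a ratio estimator (domain total over domain count), and compares the linearization standard error with the one that treats the domain sample size `n_d` as fixed. The function names and the toy data are hypothetical. It says nothing about clustered designs, where, as noted above, the number of PSUs is what matters.

```python
import math


def domain_mean_se_srs(y_domain, n, N):
    """Linearization SE of a domain mean under SRS (ratio-estimator form).

    y_domain: sampled values that fall in the subpopulation
    n:        full sample size (domain and non-domain units)
    N:        population size
    """
    n_d = len(y_domain)
    ybar_d = sum(y_domain) / n_d
    # Residuals e_i = x_i * (y_i - ybar_d) are zero outside the domain,
    # so only domain units contribute, but the divisor is n - 1.
    s_e2 = sum((y - ybar_d) ** 2 for y in y_domain) / (n - 1)
    xbar = n_d / n  # sample proportion of units in the domain
    return math.sqrt((1 - n / N) * s_e2 / (n * xbar ** 2))


def fixed_size_se(y_domain, n, N):
    """SE that treats the domain sample size n_d as if it had been fixed."""
    n_d = len(y_domain)
    ybar_d = sum(y_domain) / n_d
    s2 = sum((y - ybar_d) ** 2 for y in y_domain) / (n_d - 1)
    return math.sqrt((1 - n / N) * s2 / n_d)


if __name__ == "__main__":
    # Toy data (hypothetical): the ratio of the two SEs depends only on
    # n and n_d, and approaches 1 as the domain sample grows.
    for n, n_d in [(100, 5), (1_000, 50), (100_000, 3_000)]:
        y = [float(i % 7) for i in range(n_d)]
        ratio = domain_mean_se_srs(y, n, 10_000_000) / fixed_size_se(y, n, 10_000_000)
        print(n, n_d, ratio)
```

Under these assumptions the two standard errors differ by the factor sqrt((n / (n - 1)) * ((n_d - 1) / n_d)), which is already very close to 1 for a few thousand domain observations.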

**References**:

- **st: svy subpop option and e(sample)** *From:* Richard Williams <richardwilliams.ndu@gmail.com>
- **Re: st: svy subpop option and e(sample)** *From:* Steven Samuels <sjsamuels@gmail.com>
- **Re: st: svy subpop option and e(sample)** *From:* Richard Williams <richardwilliams.ndu@gmail.com>
- **Re: st: svy subpop option and e(sample)** *From:* Steven Samuels <sjsamuels@gmail.com>
- **Re: st: svy subpop option and e(sample)** *From:* Hitesh Chandwani <hchandwani.stata@gmail.com>
