
From: Stas Kolenikov <skolenik@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: seeking answer to survey set question
Date: Fri, 28 Aug 2009 15:14:54 -0500

Very simply put, the survey characteristic that matches the sample to your population is the sampling weights. Strata and PSUs are used to obtain correct standard errors; if stratification is ignored (as Austin implicitly suggests below), you will get conservative standard errors (i.e., too large).

You should try to find out from the documentation why those 1406 observations have missing design information. Were they not sampled originally, but added as a substitution? Is this a part of a replenishment sample? Are those the students who transferred to another school, and hence aren't a part of the original PSU in the later waves? What is the story behind them?

On Fri, Aug 28, 2009 at 11:04 AM, Austin Nichols<austinnichols@gmail.com> wrote:
> James <jpsanders@wsu.edu> :
> Here's what I would do:
>
> egen c=group(strata psu), m
> ologit depvar indvar1 indvar2 [pw=pw], cluster(c)
>
> which puts all the missing-strata people in one stratum.
>
> On Fri, Aug 28, 2009 at 11:59 AM, Sanders, James Parry<jpsanders@wsu.edu> wrote:
>> Hello,
>> NELS (educational) data comes packaged with psu, pw, and strata data. When I svyset the data, I am told that 1,406 cases have missing values in the survey characteristics (all 1,406 are missing psu and strata data). Thus, when I run a survey command (e.g. svy: ologit), these 1,406 are excluded from the analysis. Alternatively, I can keep the 1,406 in by running a standard command and including 2 of the three weights but leaving out the strata values (e.g. ologit depvar indvar1 indvar2 [pw=pw], cluster(psu)). Either way the results are essentially the same.
>>
>> My question is this: Which way is preferred? The first way includes all weights but drops 8% of the sample who otherwise have complete data. The second way keeps everyone but doesn't include the strata data and may thus not be fully representative of the population. Is there a way to keep cases lacking psu and strata data in svy commands? Alternatively, is there a way to include the strata data in the non-survey command?
>> Thanks for any help,
>> James

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
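
For readers who want to compare the approaches discussed in this thread, here is a minimal Stata sketch. It assumes the variable names used above (depvar, indvar1, indvar2, psu, strata, pw); the variable c and the stored-estimate names full and nostrata are hypothetical names chosen for illustration. It is not a recommendation of either approach, only a way to see both sets of standard errors side by side.

```stata
* Sketch only: depvar, indvar1, indvar2, psu, strata and pw are the
* variable names used in this thread; c, full and nostrata are
* hypothetical names introduced here.

* How many cases lack design information?
count if missing(psu, strata)

* Full design: cases with missing psu or strata are excluded
svyset psu [pweight=pw], strata(strata)
svy: ologit depvar indvar1 indvar2
estimates store full

* Weights plus a cluster id; the missing option treats missing
* values of strata/psu as a group of their own instead of dropping them
egen c = group(strata psu), missing
ologit depvar indvar1 indvar2 [pweight=pw], vce(cluster c)
estimates store nostrata

* Coefficients and standard errors side by side
estimates table full nostrata, b se
```

If the point estimates agree and only the standard errors differ, the choice comes down to how the 1,406 cases entered the sample, which is the question raised in the reply above.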

