
From: Margo Schlanger <margo.schlanger@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: Dealing with survey data when the entire population is also in the dataset
Date: Fri, 24 Jul 2009 19:06:33 -0400

Hi,

I have a dataset in which the unit of observation is a "case". I started with a complete census of the ~4000 relevant cases; each of them gets a line in my dataset. I have data on a few variables for every case (when it was filed, where it was filed, the type of outcome, etc.). I then randomly sampled the cases using 3 strata (for one stratum the sampling probability was 1, for another about .5, and for the third about .75), ending up with a sample of about 2000. I know much more about the cases in this sample.

Ok, my questions:

1) How do I use the svyset command to describe this dataset? It would be easy if I just dropped all the non-sampled observations, but I don't want to do that, because of question 2.

2) How do I compare the sample to the entire population, just to demonstrate that my sample isn't very different from the population on any of the few variables for which I have comprehensive data? If I didn't have to worry about weighting, I could do this simply:

    tabulate year sample, chi2

But I need the weights. In addition, I can't simply use weighting commands, because in the population (when sample == 0) everything should be weighted the same; the weights apply only to my sample (when sample == 1). And I can't (so far) use survey commands, because I don't know the answer to (1), above.

NOTE: Nearly all the variables I care about are categorical: year of filing, type of case. But it's easy enough to turn them into dummies, if that's useful.

Thanks for any help with this.

Margo Schlanger
______________________
Professor of Law
University of Michigan Law School
Director, Civil Rights Litigation Clearinghouse
(http://clearinghouse.wustl.edu)
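
For readers following the arithmetic: the design described above implies inverse-probability weights of 1, 2, and about 1.33 for the three strata. The block below is a minimal sketch in Python (not Stata, and not an answer to the svyset question) of the point comparison in question 2: unweighted category shares over the full census versus weighted shares over the sampled cases only. The stratum labels, the field names `stratum`, `year`, and `sample`, and the ten-row dataset are all invented for illustration. It produces point estimates only, with no standard errors or test statistic.

```python
from collections import Counter, defaultdict

# Sampling probabilities per stratum. The values come from the post; the
# labels "A", "B", "C" are hypothetical.
SAMPLING_PROB = {"A": 1.0, "B": 0.5, "C": 0.75}


def design_weight(stratum):
    """Inverse-probability sampling weight for a case in the given stratum."""
    return 1.0 / SAMPLING_PROB[stratum]


def population_shares(cases, var):
    """Unweighted share of each category of `var` over ALL cases (the census)."""
    counts = Counter(c[var] for c in cases)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def weighted_sample_shares(cases, var):
    """Weighted share of each category of `var` among sampled cases only."""
    totals = defaultdict(float)
    for c in cases:
        if c["sample"] == 1:
            totals[c[var]] += design_weight(c["stratum"])
    grand = sum(totals.values())
    return {k: v / grand for k, v in totals.items()}


# Made-up census of 10 cases: (stratum, year of filing, sampled flag).
rows = [
    ("A", 2005, 1), ("A", 2006, 1),
    ("B", 2005, 1), ("B", 2005, 0), ("B", 2006, 1), ("B", 2006, 0),
    ("C", 2005, 1), ("C", 2006, 1), ("C", 2006, 1), ("C", 2006, 0),
]
cases = [{"stratum": s, "year": y, "sample": f} for s, y, f in rows]

pop = population_shares(cases, "year")
smp = weighted_sample_shares(cases, "year")

# With correct weights, the sampled cases' weights sum to the census size.
weight_total = sum(design_weight(c["stratum"]) for c in cases if c["sample"] == 1)

for year in sorted(pop):
    print(year, round(pop[year], 3), round(smp[year], 3))
```

The population side uses every row with equal weight and the sample side uses only rows with sample == 1, which matches the constraint in the post that weights apply to the sample alone. Whether the observed differences are larger than chance would still require a design-based test.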

**Follow-Ups**: **Re: st: Dealing with survey data when the entire population is also in the dataset** *From:* "Michael I. Lichter" <MLichter@Buffalo.EDU>


