Statalist



st: stratify question


From   <Sim.Oertel@t-online.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: stratify question
Date   Fri, 12 Sep 2008 06:13:19 +0200

Dear Stata Users,

I have a quick question. I am working with two datasets. The first one
(dataset "A") contains 3,500 cases, the second one (dataset "B") contains
150,000 cases. 

In dataset "A" variable "x" equals 1; in dataset "B" it equals 0. For a
pre-study, I would like to analyze how variable "x" influences the survival
chances of a subject. Each dataset contains further variables, and the
distributions of these variables differ between the two datasets. For
example, welfare can take values between 0 and 10. In sample "B" welfare is
generally higher than in sample "A". Since I am not interested in the effect
of welfare, I would like to draw a stratified sample from dataset "B" whose
subjects have the same distribution of welfare as the subjects in sample
"A". The datasets contain many other variables, so I would like to extend
this procedure to more than one variable.

Put differently, I would like to tell Stata: "Generate a sample "C" from
dataset "B" which has the same proportions of variable values for
"welfare", "age", "location", ... as sample "A"."

Finally, I would combine the stratified sample of "B" and the full sample of
"A" and run a logit model.   
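
To make the idea concrete, here is a rough sketch of the kind of procedure
I have in mind (frequency matching on cells). It is only an illustration
under several assumptions: the file names a.dta and b.dta, the outcome
variable died, the ratio of 5 "B" cases per "A" case, and the seed are all
hypothetical, and "welfare", "age" and "location" are assumed to be
categorical (a continuous variable would have to be grouped first).

```stata
* Sketch only: file names, the outcome "died", and the ratio k are assumptions.
* Step 1: count how many A cases fall into each welfare/age/location cell.
use a, clear
contract welfare age location, freq(n_a)
tempfile cells
save `cells'

* Step 2: attach A's cell counts to B, keeping only cells that occur in A.
use b, clear
merge m:1 welfare age location using `cells', keep(match) nogenerate

* Step 3: within each cell, keep a random k * n_a cases from B.
set seed 12345
local k 5
generate double u = runiform()
bysort welfare age location (u): keep if _n <= `k' * n_a
drop u n_a

* Step 4: combine the stratified B sample with all of A and fit the model.
append using a
logit died x
```

If a cell in "B" holds fewer than k times the number of "A" cases, all of
its cases are kept and the proportions will match only approximately, so k
would have to be chosen small enough for the sparsest cell.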

Is there a simple way of doing this in Stata? How can I generate a
stratified sample with Stata?

Thanks for your help
Simon



