Dear Stata Users,

I have a quick question. I am working with two datasets. The first one (dataset "A") contains 3,500 cases; the second one (dataset "B") contains 150,000 cases. In dataset "A" variable "x" equals 1; in dataset "B" it equals 0. For a pre-study, I would like to analyze how variable "x" influences the survival chances of a subject.

Each dataset contains further variables, but the distributions of these variables differ between the two datasets. For example, welfare can take values between 0 and 10, and in sample "B" welfare is generally higher than in sample "A". Since I am not interested in the effect of welfare, I would like to draw a stratified sample from dataset "B" whose subjects have the same distribution of welfare as the subjects in sample "A".

The datasets contain many other variables, so I would like to extend this procedure to more than one variable. Put differently, I would like to tell Stata: "Generate a sample "C" from dataset "B" that has the same proportions of values for "welfare", "age", "location", ... as sample "A"."

Finally, I would combine the stratified sample of "B" with the full sample "A" and run a logit model.

Is there a simple way of doing this in Stata? How can I generate a stratified sample with Stata?

Thanks for your help,
Simon
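The procedure described above could be sketched roughly as follows. This is only one possible reading of the request, not a tested solution. The file names (A.dta, B.dta), the variable names (agegrp, survived), and the sampling ratio k are assumptions made for illustration. In particular, it assumes age has already been grouped into a categorical variable, since matching on exact values of a continuous variable would leave most strata nearly empty.

```stata
* Sketch only: file names, variable names, and k are hypothetical.

* 1. Count the cases in each stratum of A.
use A, clear
contract welfare agegrp location, freq(nA)
tempfile strataA
save `strataA'

* 2. Attach the A counts to B, keeping only strata that occur in A.
use B, clear
merge m:1 welfare agegrp location using `strataA', keep(match) nogenerate

* 3. Within each stratum, keep a random k * nA cases from B.
set seed 12345
local k = 5
generate double u = runiform()
bysort welfare agegrp location (u): generate byte pick = _n <= `k' * nA
keep if pick
drop u pick nA

* 4. Combine with the full sample A and fit the logit model.
append using A
logit survived x
```

If a stratum of B holds fewer than k * nA cases, all of its cases are kept, so the proportions in the resulting sample match those of A only approximately. A smaller k reduces that risk.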

