

From: Austin Nichols <austinnichols@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: semi-random sampling (how to impose properties of one population onto a subsample of a different population)

Date: Sun, 7 Aug 2011 21:35:54 -0400

Ekaterina Hertog--
Do you need a sample? You could just reweight using a propensity score. I.e., is this for analysis or for surveying the sampled obs?

On Sun, Aug 7, 2011 at 10:32 AM, Steven Samuels <sjsamuels@gmail.com> wrote:
>
> Sorry, I misunderstood. Here's code that you can adapt. Note that you set the sample size you want in the first line
>
> *************CODE BEGINS*************
> **************CODE ENDS**************
>
> On Aug 7, 2011, at 5:05 AM, Ekaterina Hertog wrote:
>
> Dear Steven,
> thank you for your help; however, it does not fully solve my problem. Your proposed solution will allow me to roughly preserve the population percentages from the whole sample in a subsample. What I need, however, is to impose population percentages found in a different dataset on a subsample I am creating. Essentially I have two datasets: one of high income women and one of middle income women. High income women tend to be older and are more likely to live in the capital. I need to create a subsample of the dataset of middle income women which would match the high income women dataset on age and location characteristics.
> Does anyone know how to do this in Stata 11?
> Ekaterina
>
> On 07/08/2011 09:08, Steven Samuels wrote:
>> The following code shows how to take a 10% sample within categories formed by two variables. The sample and whole population percentages will be approximately the same, with the agreement better for larger within-cell sample sizes.
>>
>> Steve
>>
>> *************CODE BEGINS*************
>> sysuse auto, clear
>> expand 6
>> set seed 842655
>> recode rep78 1/2=5 .=5
>> tab rep78 foreign, cell
>> sample 10, by(foreign rep78)
>> tab rep78 foreign, cell
>> **************CODE ENDS**************
>>
>> On Aug 6, 2011, at 4:23 PM, Ekaterina Hertog wrote:
>>
>> Dear all,
>> I need to take a subsample of observations from a big dataset, making sure that the people in the subsample have a given geographic and age profile.
>> I need to make sure that, say, 50% of people in the subsample come from the capital and 50% from other towns. Within each of these 2 locations I want to preserve a certain age structure: say in a city: 3 people aged 23, 4 people aged 24 …
>> Within those geographic and age profiles I want to select the observations randomly. Is it possible to do that in Stata 11? Any thoughts on how I would go about it?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
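
For readers who do want an actual subsample, here is a minimal sketch of one way to draw from one dataset so that its age-by-location cell counts match those of another. It is not the code posted in the thread. The toy data, the variable names `age` and `capital`, the helper variables `n_target` and `u`, and the seed are all assumptions made for illustration; in practice the two simulated blocks would be replaced by `use` commands for the real high-income and middle-income files.

```stata
* Sketch only: simulated data stand in for the two real files (assumption).
clear
set seed 842655

* Toy "high income" file: its age-by-location cell counts are the target.
set obs 200
generate capital = runiform() < 0.7
generate age = 23 + floor(5*runiform())
contract capital age, freq(n_target)   // one row per cell, with its count
tempfile target
save `target'

* Toy "middle income" file: the pool we sample from.
clear
set obs 5000
generate capital = runiform() < 0.3
generate age = 23 + floor(10*runiform())

* Attach the target count to every pool observation in a matching cell;
* cells that appear in only one of the two files are dropped.
merge m:1 capital age using `target', keep(match) nogenerate

* Within each cell, put observations in random order and keep the first
* n_target of them.
generate double u = runiform()
bysort capital age (u): keep if _n <= n_target

tab age capital, cell
```

If a cell in the pool holds fewer observations than the target count, the sample falls short in that cell, so it is worth comparing cell sizes before relying on the result. To match percentages at a different overall sample size, `n_target` can be rescaled and rounded before the final `keep`.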

**References**:

- **st: semi-random sampling** *From:* Ekaterina Hertog <ekaterina.hertog@sociology.ox.ac.uk>
- **Re: st: semi-random sampling** *From:* Steven Samuels <sjsamuels@gmail.com>
- **Re: st: semi-random sampling (how to impose properties of one population onto a subsample of a different population)** *From:* Ekaterina Hertog <ekaterina.hertog@sociology.ox.ac.uk>
- **Re: st: semi-random sampling (how to impose properties of one population onto a subsample of a different population)** *From:* Steven Samuels <sjsamuels@gmail.com>
