
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: semi-random sampling (how to impose properties of one population onto a subsample of a different population)

From	Ekaterina Hertog <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: semi-random sampling (how to impose properties of one population onto a subsample of a different population)
Date	Sun, 07 Aug 2011 13:05:56 +0400

Dear Steven,

Thank you for your help; however, it does not fully solve my problem. Your proposed solution will allow me to roughly preserve the population percentages from the whole sample in a subsample. What I need, however, is to impose population percentages found in a different dataset on a subsample I am creating. Essentially I have two datasets: one of high income women and one of middle income women. High income women tend to be older and are more likely to live in the capital. I need to create a subsample of the dataset of middle income women which would match the high income women dataset on age and location characteristics.

Does anyone know how to do this in Stata 11?
Ekaterina
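
One possible way to approach the request above is frequency matching: count the observations in each age-by-location cell of the high income dataset, merge those counts onto the middle income dataset, and then keep that many randomly chosen observations per cell. The sketch below is only an illustration under stated assumptions. Both datasets are simulated, and the variable names (id, age, capital, n_high, u) are hypothetical rather than taken from the real data. It relies on commands available in Stata 11 (contract, merge m:1, runiform()) and is meant to be run as a do-file so that the temporary file name survives between commands.

```stata
* Sketch only: simulated data and hypothetical variable names.
clear
set seed 12345

* --- Stand-in for the high income dataset (older, mostly in the capital) ---
set obs 200
gen age     = 23 + floor(5*runiform())      // ages 23-27
gen capital = runiform() < 0.7              // 1 = lives in the capital
contract age capital, freq(n_high)          // one row per cell with its count
tempfile target
save `target'

* --- Stand-in for the middle income dataset (the one to be subsampled) ---
clear
set obs 5000
gen long id = _n
gen age     = 20 + floor(10*runiform())     // ages 20-29
gen capital = runiform() < 0.3

* Attach the target count to every middle income observation.
* keep(match) silently drops cells that exist in only one dataset.
merge m:1 age capital using `target', keep(match) nogenerate

* Stop if any cell has fewer middle income women than the target needs.
bysort age capital: assert _N >= n_high

* Random order within cell, then keep the first n_high observations.
gen double u = runiform()
bysort age capital (u id): keep if _n <= n_high

tab age capital, cell                       // should mirror the target profile
```

If a cell in the middle income data is too small, the assert stops the do-file. One option in that case would be to scale every target count by a common factor below 1 before the keep step, so that the proportions are preserved with a smaller subsample.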

On 07/08/2011 09:08, Steven Samuels wrote:

The following code shows how to take a 10% sample within categories formed by two variables. The sample and whole population percentages will be approximately the same, with the agreement better for larger within-cell sample sizes.

Steve

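*************CODE BEGINS*************
sysuse auto, clear
expand 6                      // replicate each observation 6 times to enlarge the data
set seed 842655               // make the random draw reproducible
recode rep78 1/2=5 .=5        // fold sparse and missing rep78 values into category 5
tab rep78 foreign, cell       // cell percentages in the full data
sample 10, by(foreign rep78)  // 10% random sample within each foreign x rep78 cell
tab rep78 foreign, cell       // cell percentages in the sample
**************CODE ENDS**************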



On Aug 6, 2011, at 4:23 PM, Ekaterina Hertog wrote:

Dear all,
I need to take a subsample of observations from a big dataset, making sure that the people in the subsample have a given geographic and age profile. I need to make sure that, say, 50% of people in the subsample come from the capital and 50% from other towns. Within each of these 2 locations I want to preserve a certain age structure: say in a city: 3 people aged 23, 4 people aged 24 …
Within those geographic and age profiles I want to select the observations randomly. Is it possible to do that in Stata 11? Any thoughts on how I would go about it?
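
A minimal sketch of this kind of quota sampling, assuming the required counts per location-and-age cell are typed in by hand: the quota table is merged onto the data, and the stated number of observations is then drawn at random within each cell. The data are simulated, the variable names (id, capital, age, quota, u) are hypothetical, and the quotas for the non-capital cells are invented purely for illustration. It should be run as a do-file so that the temporary file name is preserved.

```stata
* Sketch only: simulated data, hypothetical names, illustrative quotas.
clear
input byte capital byte age int quota
1 23 3
1 24 4
0 23 3
0 24 4
end
tempfile quotas
save `quotas'

* Stand-in for the big dataset
clear
set seed 2011
set obs 1000
gen long id      = _n
gen byte capital = runiform() < 0.5
gen byte age     = 23 + floor(2*runiform())   // ages 23-24

* Attach each cell's quota; cells without a quota are dropped
merge m:1 capital age using `quotas', keep(match) nogenerate

* Random order within cell, then keep as many observations as the quota asks for
gen double u = runiform()
bysort capital age (u id): keep if _n <= quota

tab age capital
```

Here the capital and the other towns each contribute 7 people, which gives the 50/50 split, and selection within each cell is random. When the same number is wanted from every cell, `sample #, count by(capital age)` is a shorter route.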

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



