Re: st: Longitudinal sampling (many waves)

From	Steve Samuels <[email protected]>
To	[email protected]
Subject	Re: st: Longitudinal sampling (many waves)
Date	Mon, 19 Mar 2012 16:33:42 -0400

Oops! 1 in 200 is likely not to pick anything in the auto data set. Try 1 in 5 instead:
***********
samplesys 5
***********

If there are only a few late entrants to the panel, a simple random sample in each time period after initial enrollment might not be feasible. If so, order new entrants by strata (optionally) and entry date, then take a systematic sample.

***************CODE BEGINS*******************
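capture program drop samplesys

program define samplesys, rclass
    /* Draw a 1 in k systematic sample: syntax "samplesys k [if] [in]" */
    syntax anything(name=k) [if] [in]
    marksample touse
    confirm integer number `k'
    if `k' < 1 {
        display as error "k must be a positive integer"
        exit 198
    }
    tempname start
    tempvar seq
    /* random start between 1 and k */
    scalar `start' = ceil(`k'*runiform())
    /* count positions among the eligible observations only, */
    /* so that -if- and -in- still give a 1 in k sample      */
    gen long `seq' = sum(`touse')
    keep if `touse' & mod(`seq'-`start',`k')==0
    /* the random start is returned in r(start) */
    return scalar start = `start'
end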

sysuse auto, clear
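gen entry_date = price   /* price stands in for an entry date */

/* order by stratum, then by entry date, before the systematic draw */
sort foreign entry_date
set seed 833332
samplesys 5   /* 1 in 5; with only 74 observations, 1 in 200 would likely select nothing */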

***************CODE ENDS*******************
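
A minimal sketch of what ordering by stratum buys you, assuming the same auto data with price standing in for entry date. The names k, start, picked, and wt are illustrative and not part of the program above. Here the draw flags units instead of dropping them, so the tabulation shows how the selections spread across the two strata; each selected unit would carry a weight of k.

```stata
* Illustrative only: inline 1 in k systematic draw that flags, rather than drops, units
sysuse auto, clear
gen entry_date = price            /* price stands in for an entry date */
sort foreign entry_date           /* stratum first, then entry date */
set seed 833332
local k = 5                       /* assumed sampling interval */
local start = ceil(`k'*runiform())
gen byte picked = mod(_n - `start', `k') == 0
gen wt = `k' if picked            /* each sampled unit represents k units */
tabulate foreign picked
```

Because the list is sorted by foreign before every k-th unit is taken, each stratum contributes about 1 in k of its units without a separate draw per stratum.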

Steve

Laurie Molina:

You asked if there is a command for longitudinal sampling. Brendan's suggestion is valid. Randomly sample units who enter in the same time period and follow them for as long as they remain in the database. There will be no distinction between the sample in T+10 and the one in T+11.

Sample with a fixed probability of 1 in k to ensure equal weights for the cross-sectional analyses. Start with a dataset containing, for each individual, the study ID and entry period (e.g., month). To take, e.g., a 1 in 200 sample (-sample- takes a percentage, so 0.5 means 0.5%):

**************************
set seed [YOUR CHOICE]
sample 0.5 , by(entry_period)
**************************

You can add important strata to the by() clause.
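
A minimal sketch of adding a stratum to by(), using simulated data. The variables id and region, the number of entry periods, and the seed are all assumptions made up for illustration.

```stata
* Illustrative only: simulated frame with one row per individual
clear
set seed 12345                           /* arbitrary seed */
set obs 20000
gen long id = _n
gen entry_period = ceil(10*runiform())   /* hypothetical: 10 entry periods */
gen region = ceil(4*runiform())          /* hypothetical stratum variable */

* 0.5 percent (1 in 200) within each entry_period-by-region cell
sample 0.5, by(entry_period region)
tabulate entry_period region
```

With small cells, 0.5% of the cell size is rounded to a whole number of observations, so the realized fraction in a given cell can differ somewhat from 1 in 200.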

For longitudinal analyses, no simple sampling plan will compensate for attrition. You would have to take in each period a new sample of continuing units who "resemble" those lost in the period with respect to entry date and other characteristics. Re-weighting the continuing sample members is preferable, I think; propensity score and weighting class approaches are both popular.
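
A minimal sketch of the two re-weighting approaches, on simulated data. Everything here is an assumption for illustration: the variable names (wclass, basewt, present), the attrition pattern, and the use of the weighting class as the only predictor in the propensity model.

```stata
* Illustrative only: simulated sample with attrition that varies by class
clear
set seed 2012                                /* arbitrary seed */
set obs 1000
gen long id = _n
gen wclass = ceil(5*runiform())              /* hypothetical weighting class */
gen double basewt = 200                      /* 1 in 200 sample: base weight 200 */
gen byte present = (runiform() < 0.9 - 0.05*wclass)   /* 1 = still in the database */

* Weighting-class adjustment: continuing units absorb the weight
* of the units lost from the same class
bysort wclass: egen double tot_all  = total(basewt)
bysort wclass: egen double tot_stay = total(basewt*present)
gen double adjwt = basewt*tot_all/tot_stay if present

* Check: adjusted weights of continuing units sum to the original total
quietly summarize basewt
local all = r(sum)
quietly summarize adjwt
assert reldif(r(sum), `all') < 1e-8

* Propensity-score alternative: divide by the estimated retention probability
logit present i.wclass
predict double phat if e(sample)
gen double adjwt_ps = basewt/phat if present
```

In practice the propensity model would include entry date and whatever other characteristics predict loss from the database; the class indicator is used here only to keep the sketch short.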

Steve

This is essentially the same question that you posted on February 22. You've either missed or ignored Brendan Halpin's excellent response at http://www.stata.com/statalist/archive/2012-02/msg01033.html.

Steve

On Mar 15, 2012, at 1:12 PM, Laurie Molina wrote:

Hi guys,
I was wondering if Stata has any command for longitudinal sampling.

I have a database which allows me to follow the same observations over
time. However, some of these observations disappear over time, and
some new observations appear as well, as time goes by.
I would like to take a sample that is representative at every period
of time, that captures the attrition rate of the population, as well
as the rate of entry of new observations.
Finally, I would like to be able to update this sample over time, that
is: if I have a sample that satisfies the above requirements from time
T to time T+10, in time T+11 I want to be able to take a new sample,
only on time T+11 observations, add this new sample to the T to
T+10 sample database, and make sure that it still satisfies (taking
into account the new T+11 observations) all the above requirements.

Do you think that it is possible to do this kind of sampling in Stata?

Thank you all very much!