Re: st: Longitudinal sampling (many waves)

From   Steve Samuels
Subject   Re: st: Longitudinal sampling (many waves)
Date   Mon, 19 Mar 2012 13:14:11 -0400

Laurie Molina:

You asked if there is a command for longitudinal sampling. Brendan's suggestion is valid. Randomly sample units who enter in the same time period and follow them for as long as they remain in the database. There will be no distinction between the sample in T+10 and the one in T+11.

Sample with a fixed probability of 1 in k to ensure equal weights for the cross-sectional analyses. Start with a dataset containing, for each individual, the study ID and entry month. To take, e.g., a 1 in 200 sample (-sample- takes a percentage, so 0.5 means 0.5 percent):
 set seed [YOUR CHOICE]
 sample 0.5 , by(entry_period) 

You can add important strata to the by() clause.
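
The steps above might be put together roughly as follows. This is only a sketch under assumptions: the file names (panel, sampled_ids) and the variables id, month, and region are hypothetical, and the entry period is taken to be the first month in which a unit appears. Because -sample- rounds the number drawn within each by-group, the sketch computes the realized weight N/n per group instead of assuming exactly 200.

```stata
* Sketch only: panel.dta, id, month and region are hypothetical names
use panel, clear

* Entry period = first month in which the unit is observed (an assumption)
bysort id (month): gen entry_period = month[1]

preserve
    * One record per unit, taken at entry
    bysort id (month): keep if _n == 1
    keep id entry_period region

    * Group sizes before sampling, used for the realized weights
    bysort entry_period region: gen long N_h = _N

    * 0.5 percent (1 in 200) within each entry period and stratum
    set seed 20120319
    sample 0.5, by(entry_period region)

    * Realized design weight: close to 200, exact only up to rounding
    bysort entry_period region: gen long n_h = _N
    gen double pw = N_h / n_h

    keep id pw
    save sampled_ids, replace
restore

* Follow the sampled units through every wave in which they appear
merge m:1 id using sampled_ids, keep(match) nogenerate
```

When a new period such as T+11 arrives, the same -sample- call can be run on the T+11 entrants alone and the result added to sampled_ids with -append-. Units sampled earlier are untouched, so the earlier waves are not affected.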

For longitudinal analyses, no simple sampling plan will compensate for attrition.  You would have to take in each period a new sample of continuing units who "resemble" those lost in the period with respect to entry date and other characteristics.  Re-weighting the continuing sample members is preferable, I think; propensity score and weighting class approaches are both popular.
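
For illustration, the two re-weighting approaches could look something like this. It is a sketch, not a recommended model: stay (1 if the unit is still present in the next wave, 0 if lost), the covariates x1 and x2, and the design weight pw are all hypothetical names, and the choice of predictors and weighting classes would have to come from the data at hand.

```stata
* Sketch only: stay, x1, x2 and pw are hypothetical variables
* stay = 1 if the unit remains in the next wave, 0 if it is lost

* Propensity-score approach: model the probability of remaining
logit stay i.entry_period x1 x2 [pweight = pw]
predict double p_stay, pr

* Inflate the weights of continuing units by the inverse of that probability
gen double w_ps = pw / p_stay if stay == 1

* Weighting-class approach: inverse weighted retention rate within each class
* (here the classes are simply the entry periods, which is an assumption)
egen double wsum_all  = total(pw), by(entry_period)
egen double wsum_stay = total(pw * stay), by(entry_period)
gen double w_class = pw * wsum_all / wsum_stay if stay == 1
```

Either adjusted weight can then be used as a pweight in the longitudinal analysis of the continuing units.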



This is essentially the same question that you posted on February 22. You've either missed or ignored Brendan Halpin's excellent response at


On Mar 15, 2012, at 1:12 PM, Laurie Molina wrote:

Hi guys,
I was wondering if Stata has any command for longitudinal sampling.

I have a database which allows me to follow the same observations over
time. However, some of these observations disappear over time, and
some new observations appear as well, as time goes by.
I would like to take a sample that is representative at every period
of time, that captures the attrition rate of the population, as well
as the rate of new observations entry.
Finally, I would like to be able to update this sample over time, that
is: if I have a sample that satisfies the above requirements from time
T to time T+10, in time T+11 I want to be able to take a new sample,
only on time T+11 observations, and add this new sample to the T to
T+10 sample database, and make sure that it still satisfies (taking
into account the new T+11 observations) all the above requirements.

Do you think that it is possible to do this kind of sampling in Stata?

Thank you all very much!