Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Steve Samuels <sjsamuels@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Longitudinal sampling (many waves) |
Date | Mon, 19 Mar 2012 16:33:42 -0400 |
Oops! 1 in 200 is likely not to pick anything in the auto data set. Try 1 in 5 instead:

***********
samplesys 5
***********

In case there are few late entrants to the panel, a simple random sample in each time period after initial enrollment might not be feasible. If that's so, order new entrants by strata (optionally) and entry date; then take a systematic sample.

***************CODE BEGINS*******************
capture program drop _all
program define samplesys, rclass
    /* Draw 1 in k systematic sample: syntax "samplesys k" */
    syntax anything [if] [in]
    args k
    marksample touse
    confirm integer number `k'
    * Random start between 1 and k
    tempname start
    scalar `start' = ceil(`k'*runiform())
    * Keep every k-th observation, counting from the random start
    keep if mod(_n-`start',`k')==0 & `touse'
    return scalar start = `start'
end

sysuse auto, clear
gen entry_date = price
sort foreign entry_date
set seed 833332
samplesys 200
***************CODE ENDS*******************

Steve

Laurie Molina:

You asked if there is a command for longitudinal sampling. Brendan's suggestion is valid. Randomly sample units who enter in the same time period and follow them for as long as they remain in the database. There will be no distinction between the sample in T+10 and the one in T+11. Sample with a fixed probability of 1 in k to assure equal weights for the cross-sectional analyses.

Start a dataset containing, for each individual, their study ID and entry month. To take, e.g., a 1 in 200 sample (-sample- takes a percentage, so 0.5 means 0.5%):

**************************
set seed [YOUR CHOICE]
sample 0.5, by(entry_period)
**************************

You can add important strata to the by() clause.

For longitudinal analyses, no simple sampling plan will compensate for attrition. You would have to take in each period a new sample of continuing units who "resemble" those lost in the period with respect to entry date and other characteristics. Re-weighting the continuing sample members is preferable, I think; propensity score and weighting class approaches are both popular.
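A minimal sketch of a weighting-class adjustment, assuming a 1 in 200 sample and three hypothetical variables that are not in the thread: still_in (1 if the unit remains in the current period, 0 otherwise), entry_period, and stratum. It is one possible way to set this up, not a definitive recipe:

```stata
* Sketch only: still_in, entry_period, and stratum are hypothetical variable names.
* Base weight for a 1 in 200 sample
gen double base_wt = 200

* Retention rate within each weighting class (entry period by stratum)
bysort entry_period stratum: egen double retain_rate = mean(still_in)

* Continuing units carry the weight of those lost from their class
gen double adj_wt = base_wt / retain_rate if still_in == 1
```

Units that have left get a missing adj_wt. A class with no continuing members has no unit to carry the weight, so it would need to be combined with a similar class first.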
Steve
sjsamuels@gmail.com

--

This is essentially the same question that you posted on February 22. You've either missed or ignored Brendan Halpin's excellent response at http://www.stata.com/statalist/archive/2012-02/msg01033.html.

Steve
sjsamuels@gmail.com

On Mar 15, 2012, at 1:12 PM, Laurie Molina wrote:

Hi guys,

I was wondering if Stata has any command for longitudinal sampling. I have a database which allows me to follow the same observations over time. However, some of these observations disappear over time, and some new observations appear as well, as time goes by.

I would like to take a sample that is representative at every period of time, and that captures the attrition rate of the population as well as the rate of entry of new observations.

Finally, I would like to be able to update this sample over time. That is, if I have a sample that satisfies the above requirements from time T to time T+10, then in time T+11 I want to be able to take a new sample, only on time T+11 observations, add this new sample to the T to T+10 sample database, and make sure that it still satisfies all the above requirements (taking into account the new T+11 observations).

Do you think that it is possible to do this kind of sampling in Stata?

Thank you all very much!

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/