
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: Data Management
Date: Tue, 18 Nov 2008 11:15:50 -0000

Matt's approach -drop-s only the last observation in each such spell that is 3 long, and also the second last in each such spell that is 4 long, and so forth.

-tsspell- on SSC is a convenience command for such problems. Its help file is detailed, with several worked examples, including problems similar to Raphael's.

As its name implies, you must -tsset- your data before use. That is painless:

    . tsset id timevar

Now define spells as sequences of zeros:

    . tsspell, cond(count == 0)

-tsspell- automatically respects the panel structure of your data. It creates new variables with default names _spell, _seq, and _end. See the help for explanation if these names are not obvious.

The length of each spell is returned like this:

    . egen length = max(_seq), by(id _spell)

Now we are home and dry:

    . drop if length >= 3

The variables _spell, _seq, _end could be -drop-ped if they were of no further use.

Alternatively, this article spells out the principles of doing it yourself:

SJ-7-2  dm0029  . . . . . . . . . . . . . .  Speaking Stata: Identifying spells
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/07   SJ 7(2):249--265                                 (no commands)
        shows how to handle spells with complete control over
        spell specification

-findit spell- would have pointed to these and other stuff.

Nick
n.j.cox@durham.ac.uk

Matt Spittal

One way of approaching this is to use a combination of the -bysort- command and the explicit subscripting commands. For instance:

    bysort id: gen nzero = 1 if count[_n - 2] == 0 & count[_n - 1] == 0 & count == 0

should identify the observations where there are three consecutive zeros for each person. (If it isn't quite what you want, a variation on this will do the trick.) Then

    drop if nzero == 1

will exclude these observations from the dataset. Alternatively, something like

    xtpoisson count if nzero != 1

(or whatever commands you are using) will keep all the observations in the dataset, but exclude them from the analysis.

A very good description of subscripting within groups is given in the User's Guide in section 13.7.2, for Stata version 10.

Raphael Fraser

I have longitudinal data with "id" as unique identifier, "timevar" as the time variable and an outcome variable I call "count." The timevar contains the elapsed time in minutes. I would like to exclude all zeros where there are 3 or more consecutive zeros for each person. Can anyone help?

    id  timevar  count
     1        1     56
     1        2      2
     1        3      0
     1        4      0
     1        5      0
     1        6      0
     1        7      5
     1        8      0
     1        9      0
     2        1    230
     2        2      0
     2        3      0
     2        4     19

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
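
For readers who want to try the do-it-yourself route without installing -tsspell-, here is a minimal sketch using only -by:- and subscripts on Raphael's example data. It is not taken from the article cited above. The variable names start, run, and runlen are made up for illustration, and "consecutive" is assumed to mean adjacent observations within id after sorting on timevar, so gaps in timevar are not treated as breaking a run.

```stata
* Recreate Raphael's example data
clear
input id timevar count
1 1 56
1 2 2
1 3 0
1 4 0
1 5 0
1 6 0
1 7 5
1 8 0
1 9 0
2 1 230
2 2 0
2 3 0
2 4 19
end

* Flag the first observation of each run of zeros within id
* (start, run, runlen are illustrative names, not from the thread)
bysort id (timevar): gen byte start = count == 0 & (_n == 1 | count[_n - 1] != 0)

* Number the runs of zeros within id; non-zero observations get run 0
by id: gen run = cond(count == 0, sum(start), 0)

* Length of each run of zeros; 0 for observations outside any run
bysort id run: gen runlen = cond(run > 0, _N, 0)

* Drop every zero that belongs to a run of 3 or more
drop if runlen >= 3

sort id timevar
list id timevar count, sepby(id)
```

With the example data this should drop the four zeros at timevar 3 to 6 for id 1 and keep the two-zero runs, which is the behaviour Raphael asked for and differs from flagging only the third and later zeros of each run.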

**References**:
- **st: Data Management** *From:* "Raphael Fraser" <raphael.fraser@gmail.com>
- **st: RE: Data Management** *From:* "Matt Spittal" <Matt.Spittal@cancervic.org.au>

