Statalist


st: RE: RE: Data Management


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: Data Management
Date   Tue, 18 Nov 2008 11:15:50 -0000

Matt's approach -drop-s only the third and later observations in each
such spell: the last observation in a spell that is 3 long, the last two
in a spell that is 4 long, and so forth. The first two zeros of every
spell are kept. 
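
To see this concretely, here is a minimal sketch that applies Matt's command (quoted further down) to Raphael's example data. The only assumption beyond the thread is that the data are typed in with -input- exactly as Raphael posted them.

```stata
clear
input id timevar count
1 1 56
1 2 2
1 3 0
1 4 0
1 5 0
1 6 0
1 7 5
1 8 0
1 9 0
2 1 230
2 2 0
2 3 0
2 4 19
end

* (timevar) makes the sort order within id explicit
bysort id (timevar): gen nzero = 1 if count[_n - 2] == 0 & count[_n - 1] == 0 & count == 0

* Only timevar 5 and 6 of id 1 are flagged; the zeros at 3 and 4 are not
list, sepby(id) noobs
```

The spell of four zeros for id 1 (timevar 3 to 6) loses only its last two observations, which is not what Raphael asked for.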

-tsspell- on SSC is a convenience command for such problems. Its help
file is detailed, with several worked examples, including problems
similar to Raphael's. 

As its name implies, you must -tsset- your data before use. That is
painless: 

. tsset id timevar 

Now define spells as sequences of zeros: 

. tsspell, cond(count == 0) 

-tsspell- automatically respects the panel structure of your data. It
creates new variables with default names _spell, _seq, and _end. See the
help for explanation if these names are not obvious. 

The length of each spell is returned like this: 

. egen length = max(_seq), by(id _spell) 

Now we are home and dry: 

. drop if length >= 3 

The variables _spell, _seq, and _end can be -drop-ped if they are of no
further use. 
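
Putting the steps above together on Raphael's example data gives the following sketch. It assumes -tsspell- has already been installed from SSC and that the data are entered with -input- as posted; nothing else is added to the commands shown above.

```stata
* tsspell is user-written; install once with: ssc install tsspell
clear
input id timevar count
1 1 56
1 2 2
1 3 0
1 4 0
1 5 0
1 6 0
1 7 5
1 8 0
1 9 0
2 1 230
2 2 0
2 3 0
2 4 19
end

tsset id timevar

* Spells are runs of zeros; _spell and _seq are 0 outside any spell
tsspell, cond(count == 0)

* Length of each spell, repeated on every observation in it
egen length = max(_seq), by(id _spell)
list, sepby(id) noobs

* Drop every observation in a spell of 3 or more zeros
drop if length >= 3

* Tidy up the helper variables
drop _spell _seq _end length
```

With these data the -drop- should remove the four zeros for id 1 at timevar 3 to 6 and leave the two shorter runs of zeros untouched.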

Alternatively, this article spells out the principles of doing it
yourself: 

SJ-7-2  dm0029  . . . . . . . . . .  Speaking Stata: Identifying spells
        . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/07   SJ 7(2):249--265                         (no commands)
        shows how to handle spells with complete control over
        spell specification
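
As a rough illustration of doing it yourself with -by:- and subscripts, here is one possible sketch. It is not taken from the article; the variable names begin, spell, and length are chosen for illustration, and it assumes there are no gaps in timevar within each id (a gap would not break a run here).

```stata
clear
input id timevar count
1 1 56
1 2 2
1 3 0
1 4 0
1 5 0
1 6 0
1 7 5
1 8 0
1 9 0
2 1 230
2 2 0
2 3 0
2 4 19
end

* Mark the first observation of each run of zeros within id
bysort id (timevar): gen byte begin = count == 0 & (_n == 1 | count[_n - 1] != 0)

* Number the runs within id; observations outside any run get 0
by id: gen spell = cond(count == 0, sum(begin), 0)

* Run length is the number of observations sharing id and spell
bysort id spell: gen length = cond(spell > 0, _N, 0)

drop if length >= 3
sort id timevar
list, sepby(id) noobs
```

The indicator begin is 1 only where a zero follows a nonzero value (or starts the panel), so its running sum within id numbers the runs of zeros.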


-findit spell- would have pointed to these and other resources. 

Nick 
n.j.cox@durham.ac.uk 

Matt Spittal

One way of approaching this is to combine the -bysort- prefix with
explicit subscripting. For instance:

	bysort id (timevar): gen nzero = 1 if count[_n - 2] == 0 & count[_n - 1] == 0 & count == 0

should identify the observations where there are three consecutive zeros
for each person. (If it isn't quite what you want, a variation on this
will do the trick.) Then

	drop if nzero == 1

will exclude these observations from the dataset. Alternatively,
something like 

	xtpoisson count if nzero != 1

(or whatever commands you are using) will keep all the observations in
the dataset, but exclude them from the analysis.  A very good
description of subscripting within groups is given in the User's Guide
in section 13.7.2, for Stata version 10.

Raphael Fraser

I have longitudinal data with "id" as unique identifier, "timevar" as
the time variable and an outcome variable I call "count." The timevar
contains the elapsed time in minutes. I would like to exclude all
zeros where there are 3 or more consecutive zeros for each person. Can
anyone help?

id  timevar  count
1   1        56
1   2        2
1   3        0
1   4        0
1   5        0
1   6        0
1   7        5
1   8        0
1   9        0
2   1        230
2   2        0
2   3        0
2   4        19

