For illustrative purposes, subject 31 would be classified as sustained
because trace amounts are present in 1989, 1990 and 1991. Therefore the
gaps are included when counting consecutive years. If we had another
subject with gaps, say

id   dot    protein
5    1989   T
5    1992   T
5    1993   0
5    1994   T

The above subject could not be classified because of the gaps at 1990
and 1991.

On 10/18/05, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> As said earlier, you need to decide how gaps are to be handled!
>
> Nick
> n.j.cox@durham.ac.uk
>
> Raphael Fraser
>
> > Yes, gaps do exist in the data since some patients did not turn up for
> > their yearly test while others may have turned up three times for the
> > year.
> >
> > > In addition, it seems possible in principle that
> > > an individual could be assigned to two or more
> > > classes according to different parts of their history.
> >
> > This is true. Therefore if a patient is classed as minimal, sustained
> > and heavy, the latter is chosen. If another is assigned minimal and
> > sustained then sustained is chosen.
> >
> > On 10/18/05, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> > > Your example indicates that gaps may exist in the data.
> > > Subject 31 was not measured in various years e.g. 1987,
> > > 1988. Thus if you want classification according to
> > > consecutive years you need to specify how missing
> > > [meaning, not present in the data] values are to be
> > > treated. Do you mean just consecutive tests?
> > >
> > > In addition, it seems possible in principle that
> > > an individual could be assigned to two or more
> > > classes according to different parts of their history.
> > >
> > > So, I am not clear that writing specific code is
> > > the best answer to you until these ambiguities are
> > > resolved. But if you go
> > >
> > > . bysort id (dot) : gen t = _n
> > > . tsset id t
> > >
> > > you can look for spells in your data according to
> > > your stated criteria. The user-written
> > > program -tsspell- from SSC can then be used. It has
> > > a fairly detailed help file.
> > >
> > > Nick
> > > n.j.cox@durham.ac.uk
> > >
> > > Raphael Fraser
> > >
> > > > I have a longitudinal data set that contains nearly 500 patients. All
> > > > patients were tested at these times dot (date of test) for the level
> > > > of protein in the blood; the result being 0 (no protein), T (trace
> > > > amounts of protein), 1, 2, 3 or 4. I would like to classify these
> > > > subjects based on the criteria below:
> > > >
> > > > "Minimal" if protein is T on at least 2 out of 3 consecutive years.
> > > > "Sustained" if the result is minimal and lasts 3 years or more.
> > > > "Heavy" if sustained with protein 2 or greater lasting 3
> > > > years or more.
> > > >
> > > > id   dot         protein
> > > > 31   15mar1985   T
> > > > 31   14mar1986   0
> > > > 31   15mar1989   T
> > > > 31   15mar1990   T
> > > > 31   15mar1991   T
> > > > 31   15mar1993   0
> > > > 31   18feb1994   T
> > > > 31   07jun1995   0
> > > > 31   23aug1996   1
> > > > 31   10may1999   T
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
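
A rough sketch of how the "sustained" and "heavy" rules might be coded once consecutive is taken to mean consecutive calendar years, as in the subject 31 and subject 5 examples. This is plain by-group code, not the -tsspell- route suggested in the thread. It assumes dot is a Stata daily date and protein is a string variable; coding T as 0.5, keeping the highest result when a patient has several tests in one year, and reading "heavy" as 2 or greater in three consecutive calendar years are all assumptions of this sketch. The variable names yr, level, pos, hpos, run, hrun, maxrun, maxhrun and class are made up for illustration. "Minimal" is left out because its treatment of gaps is not settled in the thread.

```stata
* Sketch only: see the assumptions stated above.
* Calendar year of each test, and a numeric level with T coded as 0.5
gen yr = year(dot)
gen level = cond(protein == "T", 0.5, real(protein))

* One record per subject and calendar year: keep the highest result that year
collapse (max) level, by(id yr)

* pos: trace or above; hpos: 2 or above
gen byte pos  = level >= 0.5 & !missing(level)
gen byte hpos = level >= 2 & !missing(level)

* Running length of spells over consecutive calendar years.
* replace works down the data in order, so run[_n-1] is already updated.
bysort id (yr): gen run = pos
by id: replace run = run[_n-1] + 1 if pos & _n > 1 & yr == yr[_n-1] + 1
by id: gen hrun = hpos
by id: replace hrun = hrun[_n-1] + 1 if hpos & _n > 1 & yr == yr[_n-1] + 1

* Longest spell per subject; the most severe class reached is chosen
by id: egen maxrun  = max(run)
by id: egen maxhrun = max(hrun)
gen str9 class = cond(maxhrun >= 3, "heavy", cond(maxrun >= 3, "sustained", ""))
```

With the data shown for subject 31, run reaches 3 over 1989 to 1991, so that subject comes out as sustained. For the hypothetical subject 5, no spell exceeds one year, so class is left empty.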

