
Re: st: RE: Classifying Subjects


From   Raphael Fraser <raphael.fraser@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Classifying Subjects
Date   Tue, 18 Oct 2005 12:58:47 -0500

For illustration, subject 31 would be classified as sustained because
trace amounts were present in 1989, 1990 and 1991. In other words,
years with no test are included when counting consecutive years.
Suppose we had another subject with gaps, say

id    dot      protein
5     1989     T
5     1992     T
5     1993     0
5     1994     T

The above subject could not be classified because of the gaps in 1990 and 1991.
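
One way to make this calendar-year counting concrete is sketched below.
It is only a sketch under stated assumptions, not a full classification:
it assumes -dot- is a Stata daily date (if it already holds the year,
use it directly), that -protein- is a string variable, and that a year
counts as T if any test in that year was T. The names yr, trace, run
and maxrun are made up for illustration.

```stata
* Sketch only. Assumes -dot- is a Stata daily date and -protein- is a
* string variable holding "0", "T", "1", ...; adjust if yours differ.
gen int yr = year(dot)
gen byte trace = (protein == "T")

* One observation per subject-year: trace = 1 if any test that year was T
* (an assumption about how to handle several tests in one year)
collapse (max) trace, by(id yr)

* run = number of consecutive calendar years with T ending in this year;
* a year with no test, or a year without T, restarts the count
bysort id (yr): gen int run = trace
by id: replace run = run[_n-1] + 1 if trace & _n > 1 & yr == yr[_n-1] + 1

* longest such run for each subject
by id: egen int maxrun = max(run)
```

With the two examples in this thread, subject 31 gets maxrun = 3
(1989, 1990, 1991) and subject 5 gets maxrun = 1, since its T years are
never adjacent. Note that -collapse- replaces the data in memory, so
-preserve- first or work on a copy.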


On 10/18/05, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> As earlier said, you need to decide how gaps are to be handled!
>
> Nick
> n.j.cox@durham.ac.uk
>
> Raphael Fraser
>
> > Yes, gaps do exist in the data since some patients did not turn up for
> > their yearly test while others may have turned up three times in a
> > year.
> >
> > > In addition, it seems possible in principle that
> > > an individual could be assigned to two or more
> > > classes according to different parts of their history.
> >
> > This is true. Therefore if a patient is classed as minimal, sustained
> > and heavy, the latter is chosen. If another is assigned minimal and
> > sustained then sustained is chosen.
> >
> > On 10/18/05, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> > > Your example indicates that gaps may exist in the data.
> > > Subject 31 was not measured in various years e.g. 1987,
> > > 1988. Thus if you want classification according to
> > > consecutive years you need to specify how missing
> > > [meaning, not present in the data] values are to be
> > > treated. Do you mean just consecutive tests?
> > >
> > > In addition, it seems possible in principle that
> > > an individual could be assigned to two or more
> > > classes according to different parts of their history.
> > >
> > > So, I am not clear that writing specific code is
> > > the best answer to you until these ambiguities are
> > > resolved. But if you go
> > >
> > > . bysort id (dot) : gen t = _n
> > > . tsset id t
> > >
> > > you can look for spells in your data according to
> > > your stated criteria. The user-written
> > > program -tsspell- from SSC can then be used. It has
> > > a fairly detailed help file.
> > >
> > > Nick
> > > n.j.cox@durham.ac.uk
> > >
> > > Raphael Fraser
> > >
> > > > I have a longitudinal data set that contains nearly 500 patients.
> > > > All patients were tested at these times dot (date of test) for the
> > > > level of protein in the blood; the result being 0 (no protein) T
> > > > (trace amounts of protein), 1, 2, 3 or 4. I would like to classify
> > > > these subjects based on the criteria below:
> > > >
> > > > "Minimal" if protein is T on at least 2 out of 3 consecutive years.
> > > > "Sustained" if the result is minimal and lasts 3 years or more.
> > > > "Heavy" if sustained with protein 2 or greater lasting 3
> > > > years or more.
> > > >
> > > > id    dot          protein
> > > > 31    15mar1985    T
> > > > 31    14mar1986    0
> > > > 31    15mar1989    T
> > > > 31    15mar1990    T
> > > > 31    15mar1991    T
> > > > 31    15mar1993    0
> > > > 31    18feb1994    T
> > > > 31    07jun1995    0
> > > > 31    23aug1996    1
> > > > 31    10may1999    T
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
