.-
help for ^st_rpool^                           @net:from http://www.stata.com/users/wgould!(http://www.stata.com/users/wgould)@
.-

Survival-analysis subroutines for programmers
---------------------------------------------

	^st_rpool^ newvar [^if^ exp] [^in^ range] [^, s^trata^(^varnames^) nop^reserve ]


^st_rpool^ is for use with survival-time data; see help @st@.  You must have ^stset^
your data before using this command; see help @stset@.


Description
-----------

^st_rpool^ converts the st data in memory to data on the risk pools.  Variable
newvar is created indexing the pools.  After conversion, in addition to the
original variables in the dataset, the new variables are 

	newvar (as specified)   pool id
	^_t^                      analysis time of failure for subject
	^_t0^                     analysis time of "entry" for subject
        ^_d^                      1 if subject fails in this pool; 0 otherwise

Note that ^_t^ and ^_t0^ are the failure and entry times for the subject
regardless of pool.  (^_t0^ is typically 0 for all observations.)  The failure
time of the pool is the minimum value, within pool, of ^_t^.

The original failure and time variables that were ^stset^ are removed.

For instance, 

	. ^stset failtime, failure(death)^
	. ^st_rpool poolid^

would result in a dataset with new variables ^poolid^, ^_t0^, ^_t^, and ^_d^, but
without the variables ^failtime^ and ^death^.  
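
Because the failure time of a pool is the minimum of ^_t^ within the pool, it
can be recovered after conversion.  A minimal sketch, continuing the example
above; ^pooltime^ is a hypothetical variable name, not something ^st_rpool^
creates:

	. ^sort poolid _t^
	. ^by poolid: gen double pooltime = _t[1]^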

Technical note:  This version of st_rpool is compatible with modern, Stata 6
    programs and older, Stata 5 programs running in Stata 6.


Options
-------

^strata(^varnames^)^ specifies that separate pools are to be formed for the strata
    denoted by varnames.  varnames may be string or numeric or any combination.

^nopreserve^ specifies that the original data is not to be preserved prior to
    transformation.  Preservation here does not mean permanently saving the
    data.  If ^nopreserve^ is not specified, the data is temporarily saved
    so that, should anything go wrong during the transformation (such as
    running out of memory), the original data can be brought back.  If
    the transformation completes without problem, the preserved copy of the
    original data is then erased.

    If you wish to retain a copy of the original data, it is your
    responsibility to save the data first regardless of whether you specify 
    this option.
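
    For example, one way to keep a permanent copy before converting is
    sketched below; the filename ^mydata^ and the variable name ^poolid^ are
    placeholders:

	. ^save mydata^
	. ^st_rpool poolid, nopreserve^

    The original data can later be reloaded by typing ^use mydata, clear^.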


Remarks
-------

The risk pools R1, R2, ... are defined as the sets of observations at risk
at the first failure time, the second failure time, ... in the data.

Consider the following simple data:

        . ^list^

                    id       time       dead          x  
          1.       101          1          1       6.18  
          2.       102          4          1        .61  
          3.       103          6          0       5.55  

	. ^stset time, failure(dead)^

This data, converted to risk pools, becomes:

	. ^st_rpool event^

	. ^sort event id^

        . ^list^

                 event         id         _t         _d          x  
          1.         1        101          1          1       6.18  
          2.         1        102          4          0        .61  
          3.         1        103          6          0       5.55  
          4.         2        102          4          1        .61  
          5.         2        103          6          0       5.55  

event==1 signifies the risk pool at the time of the first failure in the data.
At that time, there were three "persons" who might have failed:
ids 101, 102, and 103.  Id 101 did fail.

event==2 signifies the risk pool at the time of the second failure.
At that time, there were two persons who might have failed, ids 102 and 103.
Id 102 did fail.

Risk pool indices reflect failure times, not the failures themselves.  If
there are multiple failures at a certain time, it is still one risk pool.
In the following data there are two failures at time 8:

        . ^list^

                    id       time       dead          x  
          1.       201          1          1        .87  
          2.       202          7          0        .26  
          3.       203          7          1        .04  
          4.       204          8          1        .42  
          5.       205          8          1         .9  
          6.       206          9          0        .52  

	. ^stset time, failure(dead)^

This data, converted to risk pools, becomes:

	. ^st_rpool event^

	. ^sort event id^

	. ^by event: list id _t _d x^

        -> event=        1  
                    id         _t         _d          x  
          1.       201          1          1        .87  
          2.       202          7          0        .26  
          3.       203          7          0        .04  
          4.       204          8          0        .42  
          5.       205          8          0         .9  
          6.       206          9          0        .52  

        -> event=        2  
                    id         _t         _d          x  
          7.       202          7          0        .26  
          8.       203          7          1        .04  
          9.       204          8          0        .42  
         10.       205          8          0         .9  
         11.       206          9          0        .52  

        -> event=        3  
                    id         _t         _d          x  
         12.       204          8          1        .42  
         13.       205          8          1         .9  
         14.       206          9          0        .52  


Note that there are two failures in the third risk pool (which corresponds 
to the failure time 8).
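
One way to check how many failures each pool contains is to cross-tabulate
the pool index against ^_d^ after conversion (a sketch; output not shown):

	. ^tabulate event _d^

Each row of the table is a risk pool, and the ^_d^==1 column counts the
failures in that pool.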

Also note that a risk pool can contain a single observation:

        . ^list^

                    id       time       dead          x  
          1.       301          1          1        .84  
          2.       302          7          0        .21  
          3.       303          7          1        .56  
          4.       304          8          1        .26  
          5.       305          8          1        .95  
          6.       306          9          1        .28  

	. ^stset time, failure(dead)^

        . ^st_rpool event^

        . ^sort event id^

        . ^by event: list id _t _d x^

        -> event=        1  
                    id         _t         _d          x  
          1.       301          1          1        .84  
          2.       302          7          0        .21  
          3.       303          7          0        .56  
          4.       304          8          0        .26  
          5.       305          8          0        .95  
          6.       306          9          0        .28  

        -> event=        2  
                    id         _t         _d          x  
          7.       302          7          0        .21  
          8.       303          7          1        .56  
          9.       304          8          0        .26  
         10.       305          8          0        .95  
         11.       306          9          0        .28  

        -> event=        3  
                    id         _t         _d          x  
         12.       304          8          1        .26  
         13.       305          8          1        .95  
         14.       306          9          0        .28  

        -> event=        4  
                    id         _t         _d          x  
         15.       306          9          1        .28  

Note that the final risk pool contains only one observation because only one
"person" was left alive at that time.

^st_rpool^ handles entry time and will allow stratification, meaning separate
pools for separate groups.  For instance,

        . ^list^

                    id         t0       time       dead      group  
          1.       401          0          4          1          1  
          2.       402          0          8          0          1  
          3.       403          0         12          1          1  
          4.       404          9         16          0          1  
          5.       405          0          4          1          2  
          6.       406          4          8          0          2  
          7.       407          0         12          1          2  

	. ^stset time, failure(dead) time0(t0)^

	. ^st_rpool event, strata(group)^

	. ^sort event id^

        . ^by event: list id _t0 _t _d group^

        -> event=        1  
                    id        _t0         _t         _d      group  
          1.       401          0          4          1          1  
          2.       402          0          8          0          1  
          3.       403          0         12          0          1  

        -> event=        2  
                    id        _t0         _t         _d      group  
          4.       403          0         12          1          1  
          5.       404          9         16          0          1  

        -> event=        3  
                    id        _t0         _t         _d      group  
          6.       405          0          4          1          2  
          7.       407          0         12          0          2  

        -> event=        4  
                    id        _t0         _t         _d      group  
          8.       407          0         12          1          2  
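
In this output, every observation in a pool satisfies
^_t0^ < (failure time of the pool) <= ^_t^.  For instance, id 406, which
enters at time 4, is not in the pool for the failure at time 4 in group 2.
Assuming that is the rule ^st_rpool^ applies in general, it can be checked
with the following sketch (^pooltime^ is a hypothetical variable name):

	. ^sort event _t^
	. ^by event: gen double pooltime = _t[1]^
	. ^assert _t0 < pooltime & pooltime <= _t^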


An example use of ^st_rpool^
--------------------------

The Cox proportional hazards model is the same as the conditional logistic
model where the groups are the risk pools.  Thus, one way of obtaining
an estimate of the Cox model is

	. ^st_rpool event^

	. ^clogit _d^ ...^, group(event)^

Results will be the same as those estimated by

	. ^stcox^ ...^, exactp^
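
As a concrete sketch, using the covariate ^x^ from the examples in Remarks:
because ^st_rpool^ replaces the data in memory, ^stcox^ is fitted first.  The
^nohr^ option makes ^stcox^ report coefficients rather than hazard ratios, so
that they can be compared directly with the ^clogit^ coefficients.

	. ^stset time, failure(dead)^
	. ^stcox x, exactp nohr^
	. ^st_rpool event^
	. ^clogit _d x, group(event)^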


Author
------

     William Gould
     StataCorp.
     wgould@@stata.com
     15 Jan 1999


Also see
--------

    STB:  STB-37 ssa9
On-line:  help for @st@, @st_is@