Stata 15 help for stsplit

[ST] stsplit -- Split and join time-span records

Syntax

Split at designated times

stsplit newvar [if], {at(numlist) | every(#)} [stsplitDT_options]

Split at failure times

stsplit [if], at(failures) [stsplitFT_options]

Join episodes

stjoin [, censored(numlist)]

stsplitDT_options    Description
-------------------------------------------------------------------------
Main
* at(numlist)        split records at specified analysis times
* every(#)           split records when analysis time is a multiple of #
  after(spec)        use time since spec for at() or every() rather than
                       time since onset of risk
  trim               exclude observations outside of range

  nopreserve         do not save original data; programmer's option
-------------------------------------------------------------------------
* Either at(numlist) or every(#) is required with stsplit at designated times.

stsplitFT_options    Description
-------------------------------------------------------------------------
Main
* at(failures)       split at observed failure times
  strata(varlist)    restrict splitting to failures within stratum
                       defined by varlist
  riskset(newvar)    create a risk-set ID variable named newvar

  nopreserve         do not save original data; programmer's option
-------------------------------------------------------------------------
* at(failures) is required with stsplit at failure times.

You must stset your dataset by using the id() option before using stsplit or stjoin; see [ST] stset. nopreserve does not appear in the dialog box.

Menu

stsplit

Statistics > Survival analysis > Setup and utilities > Split time-span records

stjoin

Statistics > Survival analysis > Setup and utilities > Join time-span records

Description

stsplit with the at(numlist) or every(#) option splits episodes into two or more episodes at the implied time points since being at risk or after a time point specified by after(). Each resulting record contains the follow-up on one subject through one time band. Expansion on multiple time scales may be obtained by repeatedly using stsplit. newvar specifies the name of the variable to be created containing the observation's category. The new variable records the interval to which each new observation belongs and is bottom coded; that is, an interval between two cut points is labeled with its lower cut point.
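As a minimal sketch of what the split produces, the following uses two made-up subjects. The variable names id, exit, died, and band and all data values are illustrative assumptions, not part of the help file. Subject 1 is followed over (0,12] and fails; subject 2 is followed over (0,7] and is censored.

```stata
* Toy data: one record per subject (all names and values are hypothetical)
clear
input id exit died
1 12 1
2  7 0
end

* id() is required before stsplit can be used
stset exit, failure(died) id(id)

* Split each record at analysis times 0, 5, and 10
stsplit band, at(0 5 10)

* One record per subject per time band; the failure stays on the last record
list id _t0 _t _d band, sepby(id)
```

Subject 1 should now appear as three records, (0,5], (5,10], and (10,12], and subject 2 as two records, (0,5] and (5,7], with band holding the lower cut point of each interval.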

stsplit, at(failures) performs episode splitting at the failure times (per stratum).

stjoin performs the reverse operation, namely, joining episodes back together when such can be done without losing information.

Options for stsplit

Main

at(numlist) or every(#) is required in syntax one; at(failures) is required for syntax two. These options specify the analysis times at which the records are to be split.

at(5(5)20) splits records at t=5, t=10, t=15, and t=20.

If at([...]max) is specified, max is replaced by a suitably large value. For instance, to split records every five analysis-time units from time zero to the largest follow-up time in our data, we could find out what the largest time value is by typing summarize _t and then explicitly typing it into the at() option, or we could just specify at(0(5)max).

every(#) is a shorthand for at(#(#)max); that is, episodes are split at each positive multiple of #.
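A typical reason to split at regular intervals is to tabulate follow-up time and failures by time band. The sketch below does this with every(5) on made-up data; the variable names (id, exit, died, band, risktime) and the values are assumptions chosen for illustration.

```stata
* Toy data (hypothetical): subject 1 fails at 12, subject 2 is censored at 7
clear
input id exit died
1 12 1
2  7 0
end
stset exit, failure(died) id(id)

* Split at every multiple of 5 analysis-time units
stsplit band, every(5)

* Person-time contributed by each split record
generate risktime = _t - _t0

* Total follow-up time and number of failures within each band
tabstat risktime _d, by(band) statistics(sum)
```

Going by the equivalence stated above, replacing every(5) with at(5(5)max) should produce the same split.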

after(spec) specifies the reference time for at() or every(). Syntax one can be thought of as corresponding to after(time of onset of risk), although you cannot really type this. You could type, however, after(time=birthdate) or after(time=marrydate) or after(marrydate).

spec has syntax

[{time | t | _t} =] {exp | min(exp) | asis(exp)}

where

time specifies that the expression be evaluated in the same time units as timevar in stset timevar, .... This is the default.

t and _t specify that the expression be evaluated in units of analysis time. t and _t are synonyms; it makes no difference whether you specify one or the other.

exp specifies the reference time. For multiepisode data, exp should be constant within subject ID.

min(exp) specifies that for multiepisode data, the minimum of exp be taken within ID.

asis(exp) specifies that for multiepisode data, exp be allowed to vary within ID.
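To illustrate after(), the sketch below splits each subject's record at a subject-specific reference time rather than at a fixed analysis time. The variable txstart (a hypothetical treatment start time), the other variable names, and the data values are all assumptions made for the example. Because the data are stset without origin() or scale(), time units and analysis-time units coincide here.

```stata
* Toy data (hypothetical): txstart is each subject's own reference time
clear
input id exit died txstart
1 10 1 4
2  8 0 3
end
stset exit, failure(died) id(id)

* at(0) after(time=txstart) splits at zero time units since txstart,
* that is, at each subject's own txstart
stsplit posttx, at(0) after(time=txstart)

list id _t0 _t _d txstart posttx, sepby(id)
```

Each subject should end up with two records, one before and one after txstart, and posttx distinguishes the two periods.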

trim specifies that observations with values less than the minimum or greater than the maximum value listed in at() be excluded from subsequent analysis. Such observations are not dropped from the data; trim merely sets their value of variable _st to 0 so they will not be used, yet they can still be retrieved the next time the dataset is stset.

strata(varlist) specifies up to five strata variables. Observations with equal values of the variables are assumed to be in the same stratum. strata() restricts episode splitting to failures that occur within the stratum, and memory requirements are reduced when strata are specified.

riskset(newvar) specifies the name for a new variable that records the unique risk set in which an episode occurs. The variable is missing for episodes that belong to no risk set.
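The following sketch shows splitting at failure times with a risk-set identifier on three made-up subjects. The variable names (id, exit, died, rs) and the values are assumptions for illustration only. Failures occur at times 3 and 5, so every record is cut at those two times.

```stata
* Toy data (hypothetical): failures at times 3 and 5, one censoring at 8
clear
input id exit died
1 3 1
2 5 1
3 8 0
end
stset exit, failure(died) id(id)

* Split every record at each observed failure time and label the risk sets
stsplit, at(failures) riskset(rs)

list id _t0 _t _d rs, sepby(id)

* Number of records in each risk set
tabulate rs
```

Subject 3's final record, (5,8], does not end at a failure time, so rs should be missing for it. Adding strata(varlist) would restrict the splitting to failure times observed within each stratum.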

The following option is available with stsplit but is not shown in the dialog box:

nopreserve is intended for use by programmers. By default, stsplit saves the original data so that they can be restored should things go wrong or if you press Break. nopreserve speeds the transformation by skipping that step. Programmers often specify this option when they have already preserved the original data. nopreserve does not affect the transformation.

Option for stjoin

censored(numlist) specifies values of the failure variable, failvar, from stset ..., failure(failvar ...) that indicate "no event" (censoring).

If you are using stjoin to rejoin records after stsplit, you do not need to specify censored(). Just do not forget to drop the variable created by stsplit before typing stjoin. See example 4 in [ST] stsplit.

Neither do you need to specify censored() if, when you stset your dataset, you specified failure(failvar) and not failure(failvar=...). Then stjoin knows that failvar = 0 and failvar = . (missing) correspond to no event. Two records can be joined if they are contiguous and record the same data and the first record has failvar = 0 or failvar = ., meaning no event at that time.

You may need to specify censored(), and you probably do if, when you stset the dataset, you specified failure(failvar=...). If stjoin is to join records, it needs to know what events do not count and can be discarded. If the only such event is failvar = ., then you do not need to specify censored().
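A small sketch of the failure(failvar=...) case follows. The data are made up: status is a hypothetical failure variable in which 1 means failure and 0 means no event, and x is a hypothetical covariate. Subject 1 has two contiguous records with identical covariate values; subject 2's records differ in x.

```stata
* Toy data (hypothetical): two records per subject
clear
input id t1 status x
1  5 0 1
1 12 1 1
2  4 0 0
2  7 0 1
end

* Only status 1 counts as a failure
stset t1, failure(status==1) id(id)

* Declare that status 0 means "no event", so such records may be joined
stjoin, censored(0)

list id t1 status x _t0 _t _d, sepby(id)
```

Subject 1's records should be joined into a single record covering (0,12], whereas subject 2's records should be left alone because x changes between them.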

Examples of splitting data at designated times

---------------------------------------------------------------------------
Setup
. webuse diet

Describe the dataset
. describe

Declare data to be survival-time data
. stset dox, failure(fail) origin(time dob) enter(time doe) scale(365.25) id(id)

List some of the data
. list id dob doe dox fail _t0 _t if id == 1 | id == 34

Split data by age at designated times
. stsplit ageband, at(40(10)70)

List some of the data
. list id _t0 _t ageband fail height if id == 1 | id == 34

Split data by time-in-study, too
. stsplit timeband, at(0(5)25) after(time=doe)

List some of the data
. list id _t0 _t ageband timeband if id == 1 | id == 34

---------------------------------------------------------------------------
Setup
. webuse stanford, clear

Create variables that preserve follow-up time such that time of transplant is the same for all patients
. generate enter = 320 - wait
. generate exit = 320 + stime

Declare data to be survival-time data
. stset exit, enter(time enter) failure(died) id(id)

Split data at time of transplant
. stsplit posttran, at(0,320)
---------------------------------------------------------------------------

Examples of splitting data at failure times

---------------------------------------------------------------------------
Setup
. webuse ocancer, clear

List some of the data
. list in 1/6, sep(0)

Declare data to be survival-time data
. stset time, failure(cens) id(patient)

Split data at failure times
. stsplit, at(failures)

---------------------------------------------------------------------------
Setup
. webuse cancer, clear

Generate an ID variable
. generate id = _n

Declare data to be survival-time data
. stset studytime, failure(died) id(id)

Split data at failure times, adding a risk set identifier to each observation
. stsplit, at(failures) riskset(RS)
---------------------------------------------------------------------------

Example of joining episodes

Setup
. webuse diet, clear

Declare data to be survival-time data
. stset dox, failure(fail) origin(time dob) enter(time doe) scale(365.25) id(id)

Split data by age at designated times
. stsplit ageband, at(40(10)70)

Drop the variable that stsplit created
. drop ageband

Join data that have been split
. stjoin

Confirm that the data match diet.dta, except for variables created by stsetting the data
. cf _all using http://www.stata-press.com/data/r15/diet, all

