[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
James Harris <james_harris@paradise.net.nz> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: Using sts list results in do-files |

Date |
Wed, 11 Jul 2007 16:27:58 +1200 |

James Harris wrote: > I'm running some simple survival analyses using > > sts list, failure at (1 12) by (gender ageband) > > My question is, can I access the results to use later in my do-file? > For example, how would I pick out failure rate at time 12, and put it > into a variable or matrix? > > Many thanks. I'm replying to my own question, as all I got from StataList was "try sts generate". I had a useful correspondance with someone who'd asked StataList this question earlier, and then in turn have been contacted by others who are dealing with the same problem. The solution I've found works in some - but not all - cases. If anyone is interested in generalising it, feel free! - James. The first method uses -tsset- to create the missing observations; then carries forward the survival rates: I'm sorry to say that I did not [get an answer from StataList], which is unfortunate. I can tell you how I addressed the problem since I had to get an answer .. I figured that, since the failure function is not a model and is essentially calculated at each particular failure time, I assumed that the failure function didn't change between successive timepoints. So, essentially, if I had a time 2 with failure rate A, and at the next timepoint 8 I have A+x, I assume my failure rate is A for time 3,4,5,6, and 7. It is not until timepoint 8 that I had the failure change. Likewise, if all my obs for one group were censored at time 10, and for my other group they were censored at time 22, the best estimate for my first group's failure function remains constant between time 10 and 22. Whether or not this line of reasoning applies to your problem, I can't say. But that's what I did, because having talked with a few of my colleagues it seemed reasonable in my case. To implement this, I did a tsset and tsfill (to create an observation for all missing timepoints 3-7, and filled the data accordingly, with a line something like the following, . bysort [grouping-variables] (_t): replace failure=failure[_n-1] if missing(failure) & ~missing(failure[_n-1]) The tsset method almost worked for me, except that it creates too many observations: I am working with a million or so hospitalisation records, that span 15 years, with a timescale of 1 day, and my aged computer simply ran out of memory. But the basic idea of carrying forward the survival rates to dummy records seemed like the way to go. So instead I created a file of observations that holds 1 record for each person for each time that I want the survival rate for: // set up the demographics file, with entries at the times of interest use ".\stata data\temp\demographics", clear // a file with a record for each unique person (identified by mast_enc) describe local BridgeDuration = 30 local DH_end = 366 local Yr2_end = 732 // two years gen time1 = `BridgeDuration' // days gen time2 = `DH_end' gen time3 = `Yr2_end' reshape long time, i(mast_enc) j(break) compress describe save ".\stata data\temp\demographics30", replace After the stset on my main dataset, I generate the survival parameter, and the same time variable as in my 'demographics' file sts generate survival = s format survival %6.4g gen time = _t I append the 'demographics' file to my main dataset, sort by time, then carryforward the survival rates: append using ".\stata data\temp\demographics30" sort mast_enc time _t // the sort by _t ensures that eg 30.x days goes after 30.0 days // mast_enc is the unique person ID // at 30 days, use earlier survival if there is one for this person // also carry forward the _st flag // and the ageband // unless previous record is also for a t=30 foreach BreakPoint in `BridgeDuration' `DH_end' `Yr2_end' /// the three breakpoints: end of bridge; rest of 1st year; second year { bysort mast_enc (time): replace _st = _st[_n-1] /// if (missing(_st) & (time == `BreakPoint') & (time[_n-1] ~= `BreakPoint') /// & (mast_enc == mast_enc[_n-1])) bysort mast_enc (time): replace survival = survival[_n-1] /// if ( missing(survival) & (time == `BreakPoint') & (time[_n-1] ~= `BreakPoint') /// & (mast_enc == mast_enc[_n-1])) //bysort mast_enc (time): replace ageband = ageband[_n-1] /// // if ( (time == `BreakPoint') & (time[_n-1] ~= `BreakPoint') & (mast_enc == mast_enc[_n-1])) } Each person in the dataset now has a survival rate at the three specified times. It can be tabluated with something like foreach BreakPoint in `BridgeDuration' `DH_end' `Yr2_end' { table gender ageband if time == `BreakPoint', contents (min survival freq) } if survival is decreasing over time. The 'min' function is needed if some people actually have an event at one of my three times of interest. Alternatively, you can create a dataset of the survival values with something like collapse (min) survival if inlist(time, `BridgeDuration' ,`DH_end' ,`Yr2_end') , by (ageband gender time ) The good news is that this method works well when each person only becomes at risk once - in my case, a chance of death after a heart attack. The bad news is that it fails dismally when each person may have multiple entries and failure events - eg when I want the chance of a heart attack given a previous heart attack. I've given up on solving that version - if anyone has any suggestions I'd be very grateful. * * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

- Prev by Date:
**Re: st: Graph Combine and Aspect Ratios** - Next by Date:
**st: Re: R and Stata efficiency** - Previous by thread:
**st: Table output** - Next by thread:
**st: Re: R and Stata efficiency** - Index(es):

© Copyright 1996–2015 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |