Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Joerg Luedicke <joerg.luedicke@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: sts graph: different results with -atrisk- vs -risktable- option
Date: Fri, 15 Apr 2011 10:53:47 -0400

On Tue, Apr 12, 2011 at 8:49 AM, Alex Gamma <alex.gamma@uzh.ch> wrote:
> Dear Statasticians,
>
> I have multiple-record survival-time data and -stset- them as shown further
> below. There are 30 records for each of 591 subjects.
>
> My question: why do I get different numbers at risk with -sts
> graph-'s -risktable- vs -atrisk- options?
>
> With -sts graph, risktable name(risktable, replace)-, I get numbers at risk
> 192, 169, 102, 0 for analysis time 0, 10, 20, 30 years, respectively.
>
> With -sts graph, atrisk name(atrisk, replace)-, I get numbers at risk
> 960, 382, 361, 338, 166, 328, 524, 232, 536 for the beginnings of each
> interval.
>
> I'm aware that the time points for which the numbers at risk are calculated
> are not the same for the two options,
> but what worries me is that -atrisk- produces numbers that are much higher
> than the number of subjects in the analysis. In fact, for the first time
> point, the number reported by -atrisk- (960) is exactly the 5-fold of that
> reported by -risktable- (192). The latter number is the right one, since it
> corresponds to the total number of subjects in the analysis, as shown in the
> results of -stset-.
>
> I've searched the statalist archives, the Stata FAQ, the Stata manual entry
> for sts graph, and googled "Stata risktable atrisk", but I found nothing
> that addressed this discrepancy.
>
> Further below, I'm also including the results of -sts list-. Maybe
> they help diagnose the problem.
>

This is indeed strange. I reproduced the problem with some toy data.
In the examples below, the first one shows correct results with the
-atrisk- option. The second one seems incorrect. The difference between
the two examples is that in the 2nd one there are adjacent time
intervals with no transition to failure, and the wrong number that
-atrisk- writes into the graph seems to be the product of the risk
population and the number of adjacent time intervals in which there is
no transition.

So in your data, it probably happened that there was no transition to
failure during 5 adjacent time intervals, and hence Stata writes 960
(5 x 192) instead of 192 into the graph. Anyway, other sources of
problems could be involved as well, so I agree that you should send
this to StataCorp. Below are the 2 examples:

//Examples reproducing possible glitch in -sts graph-, -atrisk- option

/*Example 1: Correct graph with atrisk option*/
clear all
set seed 1345
set obs 10
gen id=_n
gen dum=rbinomial(1,.4)
expand 5                                      // 5 records per subject
bys id: gen censor= rbinomial(1,.3)
bys id: replace censor=1 if censor[_n-1]==1   // once failed, stay failed
bys id: gen time=_n                           // record number within subject
stset time, id(id) f(censor)
sts graph, risktable name(risktable1)
sts graph, atrisk name(atrisk1)

/*Example 2: Incorrect results with atrisk option*/
clear all
set seed 123445
set obs 10
gen id=_n
gen dum=rbinomial(1,.4)
expand 5                                      // 5 records per subject
bys id: gen censor= rbinomial(1,.3)
bys id: replace censor=1 if censor[_n-1]==1   // once failed, stay failed
bys id: gen time=_n                           // record number within subject
stset time, id(id) f(censor)
sts graph, risktable name(risktable2)
sts graph, atrisk name(atrisk2)

//End examples

J.
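
One way to cross-check the numbers at risk independently of either graph option is sketched below. It is not part of the original examples and assumes the toy data above have just been -stset- (time running from 1 to 5, one record per subject per time unit). It relies only on -sts list- and on the _st, _t0 and _t variables that -stset- creates.

```stata
* Sketch: cross-check numbers at risk after -stset- (assumes the toy data above)

* -sts list- reports the number at risk at each analysis time
sts list

* Manual count: records under observation whose interval (_t0, _t] covers time t.
* With id() specified, each subject contributes at most one such record,
* so the count of records equals the count of subjects at risk.
* The range 1/5 is an assumption that matches the toy data only.
forvalues t = 1/5 {
    quietly count if _st == 1 & _t0 < `t' & _t >= `t'
    display "time `t': " r(N) " subjects at risk"
}
```

If these counts agree with -risktable- but not with -atrisk-, that would be consistent with the problem lying in the -atrisk- labels rather than in the data setup.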