Statalist The Stata Listserver



Re: st: missing events in stset


From   Sara Mottram <s.mottram@cphc.keele.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: missing events in stset
Date   Tue, 08 May 2007 09:44:07 +0100

Thank you for your help, Bill. I have got -stset- to identify all of the events now. There were several problems, all of which you had mentioned. I'm afraid my programming skills aren't all that good.

Best wishes
Sara

William Gould, Stata wrote:

Sara Mottram <s.mottram@cphc.keele.ac.uk> writes,

I am having some difficulty with -stset-. I'm almost certain that the fault lies with my data, as this same command has worked before in a similar dataset. However, I wonder if anyone could give me an idea as to where I might start looking to find the problem.

[...]

[...] I know from a tabulation of the data that there are 734 consultations, but when I use -stset- it identifies 730 events. One person consults at time 0, so I think this person is being ignored - I understand this. However, this still leaves three events that are unidentified.

[...]
And Sara included the following output:

-------------------------------------------------------------------
. stset cons_dt, id(surveyid) fail(kcons_post_3yr==1) origin(time edateass) exit(time censor_date)

id: surveyid
failure event: kcons_post_3yr == 1
obs. time interval: (cons_dt[_n-1], cons_dt]
exit on or before: time censor_date
t for analysis: (time-origin)
origin: time edateass

----------------------------------------------------------
16704 total obs.
28 obs. end on or before enter()
----------------------------------------------------------
16676 obs. remaining, representing
742 subjects
730 failures in multiple failure-per-subject data
703420 total analysis time at risk, at risk from t = 0
earliest observed entry t = 0
last observed exit t = 1096
-------------------------------------------------------------------


First, Sara, notice how Stata writes the time interval:

obs. time interval: (cons_dt[_n-1], cons_dt]

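Here ( marks an open end of the interval and ] a closed end, so each record covers the time after cons_dt[_n-1] up to and including cons_dt. Hence, a subject with the interval (0,0] makes no sense: the interval is empty. That subject failed before he or she entered.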

Do you have other examples like this? Do you, perhaps, have someone else with interval (12,12] or (20,20]? That would be the same story.

Note that -stset- reported
28 obs. end on or before enter()

so Sara must have obs like (12,12] or (20,20], or she has more obvious errors such as (20,12].
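
One way to see exactly which records are involved is sketched below. It assumes the -stset- command shown above has just been run, and relies on two of the variables -stset- creates: _st (1 if the record is used in the analysis, 0 otherwise) and _d (1 if the record ends in a failure). The variable names in the lists are the ones from Sara's command.

```stata
// records that -stset- excluded from the analysis
list surveyid cons_dt edateass censor_date kcons_post_3yr if _st == 0

// consultations flagged in the data but not counted as failures
list surveyid cons_dt edateass censor_date if kcons_post_3yr == 1 & _d != 1
```

The second list should contain the events missing from the failure count, whether they sit in excluded records or in records that were cut off by origin() or exit().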

Assuming the problems are all of the form (12,12] and (20,20], I would do the following:

. replace censor_date = censor_date + .125

and try again. I'm assuming that Sara's dates are all integers and so moving all the censoring dates forward just a little won't matter.
There's nothing magic about .125; Sara could use .0625 or .03125 or even, say, .00390625. Or .1, .01, .001, etc. The only reason I don't use nice numbers like .1 and .01 is that binary computers cannot store negative powers of 10 exactly, and so later I cannot type things like
. list if censor_date==12.1

I have to type things like
. list if censor_date==float(12.1)

and I invariably forget, so I use negative powers of 2 to shift dates.
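
The point about powers of 2 can be checked on a toy dataset. This is only an illustrative sketch: the variable d is hypothetical and is deliberately stored as a float, which is where the comparison problem arises.

```stata
clear
set obs 1
generate float d = 12          // hypothetical date stored as float

replace d = d + .125
count if d == 12.125           // matches: .125 is a power of 2, stored exactly

replace d = 12.1
count if d == 12.1             // no match: the float in d differs from the double literal 12.1
count if d == float(12.1)      // matches once the literal is rounded to float
```

With a shift of .125 (or any other negative power of 2 within float precision), the plain == comparison keeps working and float() is not needed.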

Anyway, perhaps moving the end dates forward just a little will solve the
problem.

Or maybe not. Sara has lots of dates in her files. Quoting from the output again:

obs. time interval: (cons_dt[_n-1], cons_dt]
exit on or before: time censor_date
t for analysis: (time-origin)
origin: time edateass

So we need to look at cons_dt as well. And we need to look at censor_date and
edateass carefully, because Sara has multiple records per subject.

I would do the following:

. sort surveyid cons_dt

// make sure dates are growing
. by surveyid: assert cons_dt > cons_dt[_n-1] if _n>1

// make sure censor_date is constant
. by surveyid: assert censor_date == censor_date[1]

// make sure edateass is constant
. by surveyid: assert edateass == edateass[1]

// make sure censor_date after enter date
. by surveyid: assert censor_date > cons_dt[1]
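
Because -assert- stops with an error at the first check that fails and does not show the offending records, a variant that flags and lists them may be more convenient. This is a sketch under the same assumptions as the checks above; the flag variables bad_order and bad_censor are hypothetical names introduced only for illustration.

```stata
sort surveyid cons_dt

// flag records whose date does not grow relative to the previous record
by surveyid: generate byte bad_order = cons_dt <= cons_dt[_n-1] if _n > 1

// flag subjects whose censoring date is not after their first date
by surveyid: generate byte bad_censor = censor_date <= cons_dt[1]

list surveyid cons_dt censor_date if bad_order == 1 | bad_censor == 1
```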

-- Bill
wgould@stata.com
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
--
Sara Mottram	
Research Assistant: Biostatistics
Primary Care Musculoskeletal Research Centre
Primary Care Sciences
Keele University
Staffordshire, ST5 5BG
Tel:  	+44 (0) 1782 584711
Fax:  	+44 (0) 1782 583911
Email:	s.mottram@cphc.keele.ac.uk
