
st: RE: : [sampling in Cox model] (more)

From	"Sayer, Bryan" <[email protected]>
To	"'FEIVESON, ALAN H. (AL) (JSC-SK) (NASA) '" <[email protected]>, "''[email protected]' '" <[email protected]>
Subject	st: RE: : [sampling in Cox model] (more)
Date	Thu, 19 Jun 2003 10:35:33 -0400

I'm sure there is a way to do informative sampling, but I think it depends
on what one is sampling for.  Typically, in a survey sample situation,
sampling is done based on information available BEFORE the sample is drawn,
i.e. census area data, administrative list, etc.  To use information
available only after the data is collected may bias the results.

So if the purpose is only to draw a sample from the dataset that will yield
estimates as close as possible to those obtained with a full dataset, then
one probably needs to pick observations around the mean of the independent
variables, possibly stratified by failure or failure time.  Of course
variance estimates will be off.

Another issue with a survival model that often seems to be lost is that
estimation is based on those still at risk as each failure occurs.  More
simply, mortality is ultimately 100%.
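To make the risk-set point concrete, here is a minimal Python sketch (not from the original thread) that lists who is still at risk at each distinct failure time. The function name `risk_sets` and the toy data are illustrative assumptions; an event value of 1 means failure and 0 means censoring.

```python
def risk_sets(times, events):
    """Map each distinct failure time to the indices of subjects still at risk.

    A subject is counted as at risk at failure time ft if its own
    follow-up time is >= ft (it has neither failed nor been censored earlier).
    """
    failure_times = sorted({t for t, e in zip(times, events) if e})
    return {
        ft: [i for i, t in enumerate(times) if t >= ft]
        for ft in failure_times
    }


# Toy data (hypothetical): follow-up times and failure indicators.
times = [2, 3, 3, 5, 8]
events = [1, 0, 1, 1, 0]

sets = risk_sets(times, events)
for ft, members in sets.items():
    print(ft, members)
```

The risk sets shrink as time passes, which is why each failure contributes to the partial likelihood only relative to the subjects remaining at that moment.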

So rather than sampling, I wonder if it would be possible to simply
aggregate the data at each failure time?

Bryan Sayer
Statistician, SSS Inc.

-----Original Message-----
From: FEIVESON, ALAN H. (AL) (JSC-SK) (NASA)
To: '[email protected]'
Sent: 6/19/03 10:06 AM
Subject: st: : [sampling in Cox model]  (more)

To rephrase what I am asking, I think one ought to be able to do some sort
of stratified sampling. The strata would be based on the independent
variables. For example, if one stratum has "risky" values of the independent
variables, one would expect a lot of failures, etc. Bryan - are you there?
Can this be done in the context of fitting Cox models?

Al

-----Original Message-----
From: FEIVESON, ALAN H. (AL) (JSC-SK) (NASA) 
Sent: Thursday, June 19, 2003 8:35 AM
To: '[email protected]'
Subject: st: [sampling in Cox model] 


This raises an interesting question. Clearly, one could take an "upstream"
sample, that is, a purely blinded one, and run it. But is there a more
efficient way? For example, if you just used the failures, you would bias
your estimates of the coefficients, but in some sense you would gain
precision. So I'm wondering if there is a way of informative sampling (that
is, purposely choosing a preponderance of failures) and somehow correcting
for bias? If you did this, would the estimates be any more accurate than if
you had just taken a noninformative sample to begin with?
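One standard design that keeps every failure and samples only among the non-failures is risk-set (nested case-control) sampling. Nobody in this thread proposes it, so the Python sketch below is only an illustration of the general idea. The function name `nested_case_control`, the parameter `m`, and the toy data are assumptions. The resulting matched sets are commonly analysed with conditional logistic regression stratified on set, not with an ordinary Cox fit on the pooled sample.

```python
import random


def nested_case_control(times, events, m, seed=0):
    """For each failure, pair the case with up to m controls drawn at random
    from the other subjects still at risk at that failure time."""
    rng = random.Random(seed)
    sampled = []
    for case, (t_case, failed) in enumerate(zip(times, events)):
        if not failed:
            continue  # censored subjects never act as cases
        # Risk set at t_case, excluding the case itself. A subject who
        # fails later (or at a tied time) can still serve as a control here.
        at_risk = [j for j, t in enumerate(times) if t >= t_case and j != case]
        controls = rng.sample(at_risk, min(m, len(at_risk)))
        sampled.append((t_case, case, sorted(controls)))
    return sampled


# Toy data (hypothetical): follow-up times and failure indicators.
times = [2, 3, 3, 5, 8, 9, 9, 12]
events = [1, 0, 1, 1, 0, 1, 0, 0]

design = nested_case_control(times, events, m=2)
for t_case, case, controls in design:
    print(t_case, case, controls)
```

Because the controls are drawn from the risk set at each failure time, the sampling uses the same information the partial likelihood uses, which is what allows the bias from over-representing failures to be handled in the analysis.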

Al Feiveson

-----Original Message-----
From: Nick Cox [mailto:[email protected]]
Sent: Wednesday, June 18, 2003 6:00 PM
To: [email protected]
Subject: st: RE: [Cox model] 


roger webb
 
> I need to run a Cox model on a very large cohort (of approximately
> 1.5 million subjects). Has anyone implemented a memory efficient
> routine that uses a sample from (as opposed to all) the individuals
> at risk?

Nothing to do with me, but I doubt that there 
is any special procedure needed here. That is,
you should just take a sample upstream and then 
fit a Cox model on the sample data, I guess. 
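A minimal Python sketch of drawing such an upstream sample, assuming subjects are identified by integer ids; the function name `upstream_sample`, the 10% fraction, and the seed are illustrative choices, not anything specified in the thread. The Cox model would then be fitted to the retained subjects only.

```python
import random


def upstream_sample(subject_ids, fraction, seed=0):
    """Draw a simple random sample of subjects without replacement,
    ignoring outcomes and covariates entirely."""
    rng = random.Random(seed)
    k = round(len(subject_ids) * fraction)
    return sorted(rng.sample(subject_ids, k))


# Hypothetical cohort of 1.5 million subjects, as in the original question.
ids = range(1_500_000)
subset = upstream_sample(ids, 0.10)
print(len(subset))
```

Fixing the seed makes the draw reproducible, so the same subset can be recovered if the model needs to be refitted.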

Nick 
[email protected] 

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
