
From: "Daniel Waxman" <dan@amplecat.com>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: how to deal with censoring at zero (a lot of zeroes) for a laboratory re
Date: Wed, 8 Jun 2005 11:35:29 -0400

Thank you. I've discovered that the 'mfp' program (multivariable fractional polynomials) has a convenient 'catzero' option, which basically automates the process of converting the zeroes to a separate binary predictor before fitting the model. Very useful!

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Arnold Kester
Sent: Wednesday, June 08, 2005 9:09 AM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: how to deal with censoring at zero (a lot of zeroes) for a laboratory re

On 06/06/2005 01:17 PM, Daniel Waxman wrote:
> Maarten, Kevin,
>
> Thank you very much for your replies. So for now I am just going to give up
> trying to make distributional assumptions and to drop the half of the
> observations which are zero or non-detectable prior to log transforming the
> predictor and to creating the logistic model. In fact, whether I do this or
> change the zero to half of the lowest detectable value (i.e. .005) doesn't
> have much of an effect on the logistic odds ratio.
>
> If anybody has any objections to this (or sees how a statistical reviewer
> for a medical journal might have objections), please let me know.

If you drop observations based on their value of a predictor variable you are in fact changing the protocol of your study. The inclusion criteria are changed to include "Troponin I is detectable". Results would be valid for people with detectable values only.

If you want to get a prediction for undetectable Troponin without assuming a specific value you could add a dummy variable troponin_zero = (troponin == 0) and substitute (say) zero for log(troponin) when troponin==0. The predicted value from this model is independent of what you choose for "log(0)".
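The invariance described in the paragraph above can be checked numerically. What follows is only a rough sketch in Python rather than Stata: the data are made up for illustration, and `fit_logistic`, `solve`, and `predicted_risks` are hypothetical helper names, not anything from the thread. It fits the same logistic model twice, once with 0 and once with log(.005) standing in for "log(0)", and compares the predicted risks.

```python
import math


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logistic(X, y, iterations=25):
    """Maximum-likelihood logistic regression by Newton-Raphson."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iterations):
        grad = [0.0] * p                      # score vector
        info = [[0.0] * p for _ in range(p)]  # information matrix X'WX
        for row, yi in zip(X, y):
            mu = inv_logit(sum(b * x for b, x in zip(beta, row)))
            for j in range(p):
                grad[j] += (yi - mu) * row[j]
                for k in range(p):
                    info[j][k] += mu * (1.0 - mu) * row[j] * row[k]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    return beta


def predicted_risks(troponin, died, placeholder):
    """Fit died ~ log(troponin) + troponin_zero, using `placeholder` for log(0)."""
    X = []
    for t in troponin:
        troponin_zero = 1.0 if t == 0 else 0.0
        log_t = placeholder if t == 0 else math.log(t)
        X.append([1.0, log_t, troponin_zero])
    beta = fit_logistic(X, died)
    risks = [inv_logit(sum(b * x for b, x in zip(beta, row))) for row in X]
    return beta, risks


# Made-up illustrative data (NOT from the thread): six undetectable results.
troponin = [0, 0, 0, 0, 0, 0, 0.01, 0.02, 0.05, 0.10, 0.50, 1.0, 5.0, 20.0]
died = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

beta_a, risk_a = predicted_risks(troponin, died, 0.0)
beta_b, risk_b = predicted_risks(troponin, died, math.log(0.005))

max_gap = max(abs(a - b) for a, b in zip(risk_a, risk_b))
print("largest difference in predicted risk:", max_gap)
print("dummy coefficient, placeholder 0:", beta_a[2])
print("dummy coefficient, placeholder log(.005):", beta_b[2])
```

In this sketch the predicted risks and the coefficient on log(troponin) agree between the two fits up to rounding; only the coefficient on the dummy shifts to absorb the placeholder, so that coefficient should not be interpreted on its own.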
Arnold

> Daniel
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of maartenbuis
> Sent: Sunday, June 05, 2005 7:38 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: how to deal with censoring at zero (a lot of zeroes) for a
> laboratory re
>
> I am tired: The critical assumption behind Multiple Imputation is that
> the probability of missingness does not depend on the value of the
> missing variable itself, given the observed data (Missing At Random, or
> MAR). This is obviously not the case with censoring. My objection against
> (conditional) mean imputation, and my remark about selecting on the
> independent variables still hold. So, given that you have a large number
> of observations, I would just ignore the zero observations.
>
> Maarten
>
> --- In statalist@yahoogroups.com, "maartenbuis" <maartenbuis@y...> wrote:
>
>>Hi Daniel,
>>
>>It looks to me like you could use -tobit- for log(troponin) and just a
>>constant. The predicted values should give you the extrapolations you
>>want. (This will be the same value for all missing observations: the
>>mean of the log-normal distribution conditional on being less than the
>>censoring value)
>>
>>However, these are actually missing values, and apparently you want to
>>create imputations for them. If you just use the values you obtained
>>from -predict- you will be assuming that you are as sure about these
>>values as you are about the values you actually observed, and thus get
>>standard errors that are too small. If you really want to impute, then
>>you could have a look at -mice- (findit mice). Alternatively, you
>>could use the results from -tobit- to generate multiple imputations.
>>Mail me if you want to do that, and I can write, tonight or tomorrow,
>>an example for the infamous auto dataset.
>>However, censoring on the independent variable is generally much less
>>of a problem than censoring on the dependent variable, so ignoring
>>(throwing away) the censored observations should not lead to very
>>different estimates.
>>
>>HTH,
>>Maarten
>>
>>--- "Daniel Waxman" <dan@a...> wrote:
>>
>>>I am modeling a laboratory test (Troponin I) as an independent
>>>(continuous) predictor of in-hospital mortality in a sample of
>>>10,000 subjects. <snip> The problem is the zero values, what they
>>>represent, and what to do with them. The distribution of results
>>>ranges from the minimal detectable level of .01 mcg/L to 94 mcg/L,
>>>with results markedly right-skewed (nearly half the results
>>>are zero; 90% are < .20; results are given in increments
>>>of .01). Of course, zero is a censored value which represents a
>>>distribution of results between zero and somewhere below .01.
>>
>><snip>
>>
>>>I found a method attributed to A.C. Cohen of doing essentially this
>>>which uses a lookup table to calculate the mean and standard
>>>deviation of an assumed log-normal distribution based upon the
>>>non-censored data and the proportion of data points that are
>>>censored, but there must be a better way to do this in Stata.
>>>
>>>Any thoughts on (1) whether it is reasonable to assume the
>>>log-normal distribution (I've played with qlognorm and plognorm, but
>>>it's hard to know what is good enough), and if so (2) how to do it?
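The single extrapolated value mentioned in the -tobit- suggestion quoted above is the mean of log(troponin) conditional on falling below the detection limit. As a rough sketch only, in Python rather than Stata, that value can be computed from the truncated-normal mean formula mu - sigma * pdf(a) / cdf(a), with a = (log(limit) - mu) / sigma. The values of `mu_hat` and `sigma_hat` below are hypothetical stand-ins for the constant and scale a censored fit would report; they are not estimates from the thread's data, and `mean_log_below_limit` is an illustrative name.

```python
import math
from statistics import NormalDist


def mean_log_below_limit(mu, sigma, limit):
    """Mean of log(x) given x < limit, when log(x) ~ Normal(mu, sigma)."""
    std_normal = NormalDist()
    a = (math.log(limit) - mu) / sigma
    # Truncated-normal mean: mu - sigma * pdf(a) / cdf(a).
    return mu - sigma * std_normal.pdf(a) / std_normal.cdf(a)


# Hypothetical values standing in for the constant and sigma of a censored fit.
mu_hat, sigma_hat = -4.0, 2.0
detection_limit = 0.01  # lowest detectable troponin level, as stated in the thread

imputed_log = mean_log_below_limit(mu_hat, sigma_hat, detection_limit)
print("conditional mean of log(troponin):", imputed_log)
print("back-transformed value:", math.exp(imputed_log))
```

The back-transformed number is the exponential of the conditional log mean, not the conditional mean of troponin itself. As noted in the quoted message, plugging in one such value for every censored observation treats it as if it had been observed, so standard errors would come out too small.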
>>*
>>* For searches and help try:
>>* http://www.stata.com/support/faqs/res/findit.html
>>* http://www.stata.com/support/statalist/faq
>>* http://www.ats.ucla.edu/stat/stata/

--
Kind regards,
Arnold Kester

**References**:
- **Re: st: how to deal with censoring at zero (a lot of zeroes) for a laboratory re**
  - *From:* Arnold Kester <arnold.kester@stat.unimaas.nl>
