Statalist



Re: st: Biomarker with lower detection limits


From   Eduardo Nunez <enunezb@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Biomarker with lower detection limits
Date   Wed, 18 Nov 2009 18:04:27 -0500

Thank you so much for your help.

Even though I know the DL, I have no idea about the biomarker's distribution.

I may try all the options suggested. However, I would like to know
whether Stata can fit a hurdle model.

Best wishes,

Eduardo

------------------------------------------------------------------------

This sounds like an application for a two-part or hurdle model:
You want to compare the frequency of below detection limit in two
populations (or to a standard) and the continuous variable above the
detection limit.
My inclination is NOT to use imputation - you already know these are
below the detection limit, so why impute something larger than that?
I've been wary of Tobit models since I read somewhere (and I don't
remember where, darn it) that they are quite sensitive to the
normality assumption.
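
A minimal sketch of the two-part idea described above, offered as an illustration rather than a definitive specification. It assumes undetectable values are coded 0 in cpies_DNA_max (as in the tabulation quoted below); x1 and x2 are hypothetical covariates, and the log transform in part 2 is an assumption made because of the skewness.

```stata
* Two-part (hurdle-type) sketch; x1 and x2 are hypothetical covariates.

* Part 1: model whether the biomarker is above the detection limit.
* Undetectable values are assumed to be coded 0.
generate byte detected = cpies_DNA_max > 0 if !missing(cpies_DNA_max)
logit detected x1 x2

* Part 2: model the level among detected values only (log scale assumed).
generate double ln_dna = ln(cpies_DNA_max) if detected == 1
regress ln_dna x1 x2 if detected == 1, vce(robust)
```

With only 11 detected values in the tabulation, part 2 can support very few covariates, so the second equation is best treated as exploratory.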

Tony

Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001

On Tue, Nov 17, 2009 at 1:24 PM, Maarten buis <maartenbuis@yahoo.co.uk> wrote:
> --- On Tue, 17/11/09, Lachenbruch, Peter wrote:
>> My inclination is NOT to use imputation - you already know
>> these are below the detection limit, so why impute something
>> larger than that?
>
> Alternatively, you could use multiple imputation, as long as
> your imputation model respects this information you have
> about your variable. This is the kind of problem Patrick
> Royston seems to have had in mind when writing this update to his
> -ice- command:
>
> Patrick Royston (2007) Multiple imputation of missing values:
> further update of ice, with an emphasis on interval censoring.
> The Stata Journal, 7(4):445-464.
> http://www.stata-journal.com/article.html?article=st0067_3
>
> Hope this helps,
> Maarten
>
> --------------------------
> Maarten L. Buis
> Institut fuer Soziologie
> Universitaet Tuebingen
> Wilhelmstrasse 36
> 72074 Tuebingen
> Germany
>
> http://www.maartenbuis.nl
> --------------------------


On Tue, Nov 17, 2009 at 2:08 PM, Austin Nichols <austinnichols@gmail.com> wrote:
> Eduardo Nunez <enunezb@gmail.com> :
> An alternative approach to consider: make a dummy variable that is one
> when your biomarker is nonzero.  Then run a probit, logit, or
> alternative model using the dummy for "above the detection limit" as
> the outcome. You can also use that dummy as an explanatory variable in
> stcox, possibly also with a second variable measuring values above the
> detection limit.  Do you know the detection limit?  Do you know
> details about the physical process that would tell you about the
> distribution of these measurements conditional on some X?  If so, you
> can write a -ml- routine using that information.
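
The indicator-variable approach suggested above might look roughly like this. It is a sketch under assumptions: undetectable values are coded 0, time and died are hypothetical names for the survival time and death indicator, x1 is a hypothetical covariate, and the log scale for the level variable is a choice made here, not something stated in the thread.

```stata
* Indicator for "above the detection limit"; undetectable values are coded 0.
generate byte above_dl = cpies_DNA_max > 0 if !missing(cpies_DNA_max)

* Level above the detection limit (log scale), set to 0 when undetectable.
generate double dna_level = cond(above_dl == 1, ln(cpies_DNA_max), 0) ///
    if !missing(above_dl)

* Binary outcome model; x1 is a hypothetical covariate.
logit above_dl x1

* Cox model with the indicator plus the level above the detection limit.
* time and died are hypothetical variable names.
stset time, failure(died)
stcox above_dl dna_level x1
```

In the Cox model, the coefficient on above_dl compares detectable with undetectable subjects at dna_level = 0 (a detected value of 1 on the original scale), and dna_level captures the gradient among the detectable ones.
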
>
> On Tue, Nov 17, 2009 at 9:01 AM, Eduardo Nunez <enunezb@gmail.com> wrote:
>> Dear statalisters:
>>
>> I wonder if anyone can advise me on the best way to analyse continuous
>> variables with lower detection limits (i.e., left-censored variables).
>> In particular, I have data on a biomarker with 92% of values reported as
>> "undetectable", and I am trying to run 2 models:
>> 1) a linear regression using it as the dependent variable, and
>> 2) stcox with mortality as outcome and the biomarker as the main exposure.
>>
>>
>> . tab cpies_DNA_max, m
>>
>> cpies_DNA |
>>       _max |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>          0 |        121       91.67       91.67
>>       2.85 |          1        0.76       92.42
>>      4.721 |          1        0.76       93.18
>>      5.059 |          1        0.76       93.94
>>      5.165 |          1        0.76       94.70
>>      6.267 |          1        0.76       95.45
>>      8.009 |          1        0.76       96.21
>>      9.965 |          1        0.76       96.97
>>     30.538 |          1        0.76       97.73
>>     35.137 |          1        0.76       98.48
>>         50 |          1        0.76       99.24
>>     71.227 |          1        0.76      100.00
>> ------------+-----------------------------------
>>      Total |        132      100.00
>>
>>
>> Censored values occur in environmental, metabolomics, and proteomics data,
>> most commonly when the level of a biomarker in a sample is below
>> the limit of quantification of the machine; these values are generally
>> reported as being less than detectable, with the detection limit (DL)
>> specified (for instance, "< 2.5").
>> Several solutions have been proposed, such as replacing those
>> values with zeros, DL, DL/2, or a random value from a
>> distribution over the range from zero to DL. However, none of them has
>> been shown to be optimal in simulation studies.
>> What I don't want is to eliminate those values and run the analysis on
>> complete cases.
>> Is it possible to use multiple imputation to replace those values?
>> If this is an option, how can I tell the imputation method not to draw
>> values above the DL?
>> Is tobit an appropriate model for the first analysis? Because of marked
>> skewness, should I normalize the variable by transforming only the
>> values above the DL?
>>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



