Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: R: st: Population attributable fractions (PAFs) in discrete-time survival analysis. -punaf-

From	Angelo Belardi <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: R: st: Population attributable fractions (PAFs) in discrete-time survival analysis. -punaf-
Date	Fri, 16 Aug 2013 12:26:01 +0100
Hello Roger,

Thanks for your interpretation of the problem. However, it does not
seem to solve the issue.

Together with my colleagues I looked through the output of our main
analyses. None of our linear predictors are very large and negative,
which you described as a possible cause of this error message.

We still suspect that the problem is connected to the -noconstant-
option of -cloglog-, because -punaf- seems to work fine when this
option is not needed. We think that -noconstant- may alter the output
of -cloglog- in a way that makes it incompatible with -punaf-.
However, I don't know exactly how -punaf- reads the output of
-cloglog-, or how -noconstant- changes the structure of the stored
results.

Best regards,
Angelo


Angelo Belardi
Ambizione research group (SNSF)
Department of Clinical Psychology and Psychiatry
University of Basel
Missionsstrasse 60/62
CH-4055 Basel, Switzerland
Email: [email protected]



2013/8/5 Roger B. Newson <[email protected]>
>
> Hello Angelo
>
> What appears to be happening here is that one or other of your scenario prevalences is being evaluated to a non-positive quantity (zero or even negative). The scenario prevalence for a -cloglog- model is a mean of predicted values of the form
>
> 1-exp(-exp(z))
>
> where z is the linear predictor of a -cloglog- model (ie the sum of the beta*X terms). For this to be non-positive, exp(-exp(z)) must be at least 1, implying that -exp(z) must be non-negative, implying that exp(z) seems to be evaluating to zero. This will happen if z (the linear predictor) is very large and negative. And, presumably, this can happen if one or more of your fitted betas "converges" to plus or minus infinity.
>
> I do not know how you have estimated extremely large negative linear predictors. And I do not know what a "fully non-parametric baseline hazard function" is, in a -cloglog- model. However, that is what appears to be happening, for whatever reason.
>
> If you are fitting a binomial model where fitted values (ie fitted binomial proportions) may sometimes be zero, then you should possibly be measuring scenario differences (using -regpar-) instead of scenario ratios (using -punaf-). However, I am not sure exactly what question you are trying to answer.
>
>
> I hope this helps.
>
> Best wishes
>
> Roger
>
> Roger B Newson BSc MSc DPhil
> Lecturer in Medical Statistics
> Respiratory Epidemiology and Public Health Group
> National Heart and Lung Institute
> Imperial College London
> Royal Brompton Campus
> Room 33, Emmanuel Kaye Building
> 1B Manresa Road
> London SW3 6LR
> UNITED KINGDOM
> Tel: +44 (0)20 7352 8121 ext 3381
> Fax: +44 (0)20 7351 8322
> Email: [email protected]
> Web page: http://www.imperial.ac.uk/nhli/r.newson/
> Departmental Web page:
> http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/
>
> Opinions expressed are those of the author, not of the institution.
>
> On 05/08/2013 10:33, Angelo Belardi wrote:
>>
>> Thanks again for your precise answers.
>>
>> I have now tried to run -punaf- after my -cloglog- analyses. However,
>> -punaf- encountered a problem.
>> The error message that comes up is: "expression (log(_b[_cons]))
>> evaluates to missing".
>>
>> I assume that this might be connected to my use of the -noconstant-
>> option in the -cloglog- commands. For analyses where the calculations
>> are possible to run without the -nocons- option, -punaf- also gives me
>> reasonable results and no error message.
>> However, from what I know I have to use this option because of my
>> fully non-parametric baseline hazard function.
>>
>> Is it possible that -punaf- has a problem with that or might the error
>> be due to something else? How could I solve this issue?
>>
>> Best regards,
>> Angelo
>>
>> 2013/7/21 Roger B. Newson <[email protected]>
>>
>>> In reply to Angelo's queries:
>>>
>>> A. You can indeed use -punaf- after -cloglog-. (Or you should be able to do so - let me know if you have any problems.) However, the interpretation of the attributable and unattributable fractions will then be similar to the interpretation of these parameters when you use -punaf- after -logit- or -logistic-. It is probably not a good idea to use -punafcc- after -cloglog-. And -punafcc- should probably not be used after -logit- or -logistic-, except if your data are from a case-control study (for which -punafcc- was written). After a Cox regression, you may use either -punaf- or -punafcc-, depending on what kind of population unattributable and attributable fractions you want to estimate (ie my kind or Samuelsen and Eide's kind).
>>>
>>> B. If you are working with a dataset with 1 observation per person per period, and the outcome variable is binary, then you should use an estimation command that allows for the clustering of person-periods by persons. For instance, you might use -xtgee-, or you might use -logit-, -logistic-, or -cloglog- with an option like -vce(cluster person)-. The interpretation of the population unattributable and attributable fractions will then be the same as when -punaf- is used after binary data. That is to say, the PAF (or PUF) will be the fraction of the binary outcomes equal to 1 that is attributable (or unattributable) to living in Scenario 0 instead of Scenario 1.
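
As a rough illustration of the interpretation in point B, here is a minimal Python sketch of a PUF and PAF computed as a ratio of scenario means on person-period data. Everything in it is an assumption made for illustration: the period effects (standing in for a baseline hazard fitted as period dummies without a constant), the exposure coefficient, and the person-period records are all made up, and the sketch ignores clustering and confidence intervals entirely.

```python
import math

def cloglog_prob(z):
    """Inverse complementary log-log link: 1 - exp(-exp(z))."""
    return -math.expm1(-math.exp(z))

# Hypothetical fitted values: one effect per period (no constant term)
# plus a log hazard ratio for a binary exposure.
period_effects = {1: -2.0, 2: -1.8, 3: -1.5}
beta_exposure = 0.7

# Hypothetical person-period records: (period, exposed).
records = [(1, 1), (2, 1), (3, 1), (1, 0), (2, 0),
           (1, 1), (1, 0), (2, 0), (3, 0)]

def scenario_mean(records, set_exposure=None):
    """Mean predicted probability, optionally with exposure fixed for everyone."""
    total = 0.0
    for period, exposed in records:
        x = exposed if set_exposure is None else set_exposure
        total += cloglog_prob(period_effects[period] + beta_exposure * x)
    return total / len(records)

mean0 = scenario_mean(records)                  # Scenario 0: exposure as observed
mean1 = scenario_mean(records, set_exposure=0)  # Scenario 1: nobody exposed
puf = mean1 / mean0                             # population unattributable fraction
paf = 1.0 - puf                                 # population attributable fraction
print(mean0, mean1, puf, paf)
```

The ratio is only defined when the Scenario 0 mean is positive, which is why a fitted probability of exactly zero in every record would be a problem for a ratio-based measure.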
>>>
>>> C. The WHO definition of a PAF is an extremely simple special case of the -punaf- definition of a PAF, for the special case of a binary outcome variable, a discrete-valued exposure variable with n levels, and no concomitant (or confounder) variables. And the WHO also assumes that "Scenario 0" is the real world that we live in, and that "Scenario 1" is a user-specified ideal scenario (eg a dream scenario where the whole world stopped smoking, or a dream scenario where the current smokers become ex-smokers, or a more realistic dream scenario where only a proportion of the current smokers quit smoking). The P_i specified by the WHO are the proportions of the population at the i'th exposure level in the real world (Scenario 0). And the P'_i are the proportions of the population that would have the i'th exposure level in the dream scenario (Scenario 1). And the RR_i are the relative risks (ie rate ratios) associated with comparing the i'th exposure level to the lowest exposure level. So, the -punaf- definition is a generalization of the WHO definition. There seems to be some controversy about how best to generalize the concept of a PAF (or a PUF) to the case of a Cox regression. (At least, I had a different idea from Samuelsen and Eide.)
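
For readers who want to see the WHO special case from point C in numbers, the following minimal Python sketch computes PAF = (sum of P_i*RR_i - sum of P'_i*RR_i) / (sum of P_i*RR_i). The function name and the three-level smoking example (never, ex-, and current smokers, with made-up proportions and relative risks) are hypothetical and chosen only for illustration.

```python
def who_paf(p_real, p_ideal, rr):
    """PAF from exposure proportions in the real (P_i) and ideal (P'_i)
    scenarios and the relative risks RR_i versus the lowest exposure level."""
    burden_real = sum(p * r for p, r in zip(p_real, rr))
    burden_ideal = sum(p * r for p, r in zip(p_ideal, rr))
    return (burden_real - burden_ideal) / burden_real

# Hypothetical exposure levels: never smokers, ex-smokers, current smokers.
p_real = [0.5, 0.2, 0.3]   # Scenario 0: the real world
p_ideal = [0.5, 0.5, 0.0]  # Scenario 1: all current smokers become ex-smokers
rr = [1.0, 1.5, 3.0]       # made-up relative risks versus never smokers

paf = who_paf(p_real, p_ideal, rr)
print(paf)
```

If the ideal scenario equals the real one, the function returns zero, as expected for an intervention that changes nothing.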
>>>
>>>
>>>
>>> I hope this helps.
>>>
>>> Best wishes
>>>
>>> Roger
>>>
>>> On 16/07/2013 23:20, Angelo Belardi wrote:
>>>>
>>>>
>>>> Roger, thanks a lot for the detailed answers and all the effort.
>>>>
>>>> After a discussion with my colleagues, I have a few follow-up
>>>> questions on the subject:
>>>>
>>>> A:  In your last reply you spoke about Cox regression. Would these
>>>> statements also apply to hazard models with a
>>>> non-parametric baseline hazard function (using -cloglog-)?
>>>>
>>>> B: We work with person-period formatted datasets we got from
>>>> reorganising our initial data. Does that have an influence on the
>>>> results we get out of -punaf- or can the results be interpreted
>>>> similarly?
>>>>
>>>> C: How would the resulting AHFs have to be interpreted? Are they
>>>> time-independent as suggested by Samuelsen and Eide (2008) in their
>>>> Equation 4? And could these be interpreted in line with the WHO
>>>> definition of PAFs, as a "proportional reduction in the hazard ratio"?
>>>>
>>>>
>>>> Best regards and thanks already for any further help
>>>> Angelo
>>>>
>>>>
>>>> References:
>>>> - Sven Ove Samuelsen and Geir Egil Eide. 2008. Attributable fractions with
>>>> survival data. Statistics in Medicine 2008; 27:1447–1467.
>>>> http://onlinelibrary.wiley.com/doi/10.1002/sim.3022/abstract
>>>> - WHO definition of population attributable fraction,
>>>> http://www.who.int/healthinfo/global_burden_disease/metrics_paf/en/index.html
>>>>
>>>>
>>>> 2013/7/1 Roger B. Newson <[email protected]>
>>>>>
>>>>>
>>>>>
>>>>> PS I have had a look at the Samuelsen and Eide paper, and would like to make a minor correction. The AHF of Equation 4 looks like the PAF that you would get by using -punaf- after a Cox regression, and is equal (in their notation) to
>>>>>
>>>>> AHF = 1 - E[exp(beta'Z*)]/E[exp(beta'Z)]
>>>>>
>>>>> where Z is the covariate vector in the real-world scenario, and Z* is the covariate vector in the fantasy-intervention scenario. If you use -punafcc- after a Cox regression, then you should instead get
>>>>>
>>>>> PAF = 1 - E[exp(beta'Z*)/exp(beta'Z)]
>>>>>
>>>>> which is not exactly the same thing. However, whichever formula we use, we should probably use the option -vce(unconditional)- if we use it after a Cox regression, because the covariates at the time of each death are subject to sampling error.
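
The difference between the two formulas above (one minus a ratio of means, versus one minus a mean of ratios) can be seen in a small numerical sketch. The Python below is an illustration only: the coefficient and covariate values are made up, the expectations are replaced by plain sample averages over a hypothetical list of subjects, and it does not address which observations -punaf- or -punafcc- actually average over.

```python
import math

def mean(values):
    return sum(values) / len(values)

def ahf_ratio_of_means(beta, z, z_star):
    """1 - E[exp(beta*Z*)] / E[exp(beta*Z)], with E replaced by a sample mean."""
    numerator = mean([math.exp(beta * v) for v in z_star])
    denominator = mean([math.exp(beta * v) for v in z])
    return 1.0 - numerator / denominator

def paf_mean_of_ratios(beta, z, z_star):
    """1 - E[exp(beta*Z*) / exp(beta*Z)], with E replaced by a sample mean."""
    ratios = [math.exp(beta * s) / math.exp(beta * v) for v, s in zip(z, z_star)]
    return 1.0 - mean(ratios)

beta = 0.7                # hypothetical log hazard ratio for a binary exposure
z_real = [1, 1, 0, 0, 0]  # real-world exposure of five made-up subjects
z_star = [0, 0, 0, 0, 0]  # fantasy scenario: exposure removed for everyone

ahf = ahf_ratio_of_means(beta, z_real, z_star)
paf = paf_mean_of_ratios(beta, z_real, z_star)
print(ahf, paf)
```

With these made-up inputs the two quantities are both between 0 and 1 but clearly not equal, which is the point of the correction.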
>>>>>
>>>>>
>>>>> Best wishes
>>>>>
>>>>> Roger
>>>>>
>>>>> On 01/07/2013 13:09, Roger B. Newson wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks to Carlo for this reference. Yes, the attributable hazard
>>>>>> fraction (AHF) in Equation (4) of Samuelsen and Eide (2008) is the same
>>>>>> as the population attributable fraction (PAF) produced by -punafcc-
>>>>>> after using -stcox-. The confidence interval formulas are a little
>>>>>> different. Samuelsen and Eide use the percentile bootstrap, whereas the
>>>>>> online help for -punafcc- recommends using Shah variances by
>>>>>> specifying the option -vce(unconditional)-. You could presumably write a
>>>>>> program to use the percentile bootstrap with -punafcc-, though.
>>>>>>
>>>>>> Best wishes
>>>>>>
>>>>>> Roger
>>>>>>
>>>>>> References
>>>>>>
>>>>>> Sven Ove Samuelsen and Geir Egil Eide. 2008. Attributable fractions with
>>>>>> survival data. Statistics in Medicine 2008; 27:1447–1467.
>>>>>>
>>>>>> On 01/07/2013 12:21, Carlo Lazzaro wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I suppose that Angelo refers to the following reference (access to the
>>>>>>> full text conditional on subscription to Stat Med):
>>>>>>>
>>>>>>> Samuelsen SO, Eide GE. Attributable fractions with survival data. Stat
>>>>>>> Med. 2008 Apr 30;27(9):1447-67.
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> Carlo
>>>>>>> -----Original Message-----
>>>>>>> From: [email protected]
>>>>>>> [mailto:[email protected]] On behalf of Roger B. Newson
>>>>>>> Sent: Monday, 1 July 2013 12:57
>>>>>>> To: [email protected]
>>>>>>> Subject: Re: st: Population attributable fractions (PAFs) in
>>>>>>> discrete-time survival analysis. -punaf-
>>>>>>>
>>>>>>> Yes, you can use -punaf- after a generalized linear model (GLM) with a
>>>>>>> complementary log-log link and a binomial error function. Or after any
>>>>>>> other GLM that gives positive-valued conditional expectations (which
>>>>>>> includes proportions and also Gamma and inverse-Gaussian means).
>>>>>>>
>>>>>>> For proportional-hazard models (and also for case-control data), there
>>>>>>> is a package -punafcc-, which you can also download from SSC, and which
>>>>>>> estimates population attributable hazard fractions (after
>>>>>>> proportional-hazard regressions), or population attributable fractions
>>>>>>> (after logit regressions on case-control data).
>>>>>>>
>>>>>>> Angelo has not given the Samuelsen & Eide (2008) reference on PAHFs in
>>>>>>> full. However, I would guess that the PAHFs of that reference would be
>>>>>>> either the same as, or similar to, those produced by -punafcc-. I would
>>>>>>> very much like to know the full reference, so I can read it and find
>>>>>>> out more.
>>>>>>>
>>>>>>> I hope this helps.
>>>>>>>
>>>>>>> Best wishes
>>>>>>>
>>>>>>> Roger
>>>>>>>
>>>>>>> On 01/07/2013 00:13, Angelo Belardi wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Dear All,
>>>>>>>>
>>>>>>>> I am working on discrete-time proportional hazard models with a
>>>>>>>> non-parametric baseline hazard function, using -cloglog- in
>>>>>>>> person-period formatted datasets.
>>>>>>>>
>>>>>>>> I would like to additionally calculate population attributable
>>>>>>>> fractions (PAFs) in these models.
>>>>>>>> However, I have never worked with PAFs in survival analyses before and
>>>>>>>> therefore don't know which functions to use and how to correctly
>>>>>>>> interpret the results.
>>>>>>>>
>>>>>>>> Previously, I calculated PAFs in Stata with the -punaf- package from
>>>>>>>> Roger Newson, e.g. for logistic regressions.
>>>>>>>>
>>>>>>>> Can I use -punaf- here as well, directly after fitting the model
>>>>>>>> with -cloglog-?
>>>>>>>>
>>>>>>>> Or is there another function/package for this situation?
>>>>>>>>
>>>>>>>> Or would it be better to calculate population attributable hazard
>>>>>>>> fractions (PAHFs) as described in Samuelsen & Eide (2008)?
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for any help or advice on the subject.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Angelo
>>>>>>>>
>>>>>>>>
>>>>>>>> Ref:
>>>>>>>> Sven Ove Samuelsen and Geir Egil Eide. 2008. Attributable fractions with
>>>>>>>> survival data. Statistics in Medicine 27:1447–1467.
>>>>>>>> http://onlinelibrary.wiley.com/doi/10.1002/sim.3022/abstract
>>>>
>>>>
>>>>
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> *   http://www.ats.ucla.edu/stat/stata/