Re: st: Population Attributable Fraction/punaf: Diagnostics for Wide CIs

From   Roger Newson
Subject   Re: st: Population Attributable Fraction/punaf: Diagnostics for Wide CIs
Date   Tue, 12 Jul 2011 11:22:38 +0100

Fat confidence intervals for PAFs are usually caused by fat confidence intervals for the corresponding population unattributable fraction (PUF). These, in turn, are caused by high standard errors for the linear combination of log odds and/or log odds ratios whose sampling variability determines the sampling variability of the PUF. Those high standard errors may arise because the kind of subjects in the fantasy scenarios involved are not well represented in the real-world sample. For instance, if a sample contains very few non-smokers, then there will usually be a fat confidence interval for the difference that would be made to disease prevalences if the whole world stopped smoking.
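To make the link between the standard error and the width of the PAF interval concrete, here is a minimal sketch in Python. It assumes that the confidence limits are computed symmetrically on the log(PUF) scale and then back-transformed, with PAF = 1 - PUF. The function name and the input values are hypothetical and are not taken from anybody's data or from punaf output.

```python
import math

def paf_ci_from_log_puf(log_puf, se, z=1.959964):
    """Back-transform an estimate and standard error for log(PUF)
    into a PAF estimate with lower and upper confidence limits.

    Assumes a symmetric normal-theory interval on the log(PUF) scale;
    z defaults to the 97.5th percentile of the standard normal (95% CI).
    """
    puf = math.exp(log_puf)
    puf_lower = math.exp(log_puf - z * se)
    puf_upper = math.exp(log_puf + z * se)
    # PAF = 1 - PUF, so the PUF limits swap roles when converted.
    return 1 - puf, 1 - puf_upper, 1 - puf_lower

# Hypothetical inputs: a PUF of 0.78 with two different standard errors.
est, lo, hi = paf_ci_from_log_puf(math.log(0.78), 0.10)
est2, lo2, hi2 = paf_ci_from_log_puf(math.log(0.78), 0.20)

print(f"SE 0.10: PAF {est:.2f} ({lo:.2f} to {hi:.2f})")
print(f"SE 0.20: PAF {est2:.2f} ({lo2:.2f} to {hi2:.2f})")
```

The point estimate is the same in both calls. Only the standard error on the log scale changes, and the PAF interval widens with it.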

I cannot comment any further on your case in particular, as I have not seen your data. However, a standard rule of thumb for logistic regression is to divide the number of events or non-events (whichever is smaller) by the number of parameters estimated, and check whether this ratio is less than 5. If so, then you are probably trying to estimate too many parameters with your available data.
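The rule of thumb above amounts to a one-line calculation. Here is a minimal sketch in Python, in which the function name and the counts are hypothetical. Only the threshold of 5 comes from the rule as stated.

```python
def events_per_parameter(n_events, n_nonevents, n_params):
    """Ratio of the rarer outcome count to the number of estimated parameters."""
    if n_params <= 0:
        raise ValueError("n_params must be positive")
    return min(n_events, n_nonevents) / n_params

THRESHOLD = 5  # from the rule of thumb: a ratio below 5 is a warning sign

# Hypothetical example: 40 events, 960 non-events, 10 parameters estimated.
ratio = events_per_parameter(40, 960, 10)
if ratio < THRESHOLD:
    print(f"Ratio {ratio:.1f} is below {THRESHOLD}: probably too many parameters.")
else:
    print(f"Ratio {ratio:.1f} is at least {THRESHOLD}.")
```

The count of parameters should include the intercept and every dummy variable generated by a categorical predictor, not just the number of predictors named in the model.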

I hope this helps.

Best wishes


Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322

Opinions expressed are those of the author, not of the institution.

On 12/07/2011 09:54, wrote:

I am using [punaf] to calculate the Population Attributable Fractions
(PAF) from a logistic regression, and have checked models with
Hosmer-Lemeshow, AIC/BIC, and delta-beta residuals.

I am getting very broad CIs, e.g. 22% (4 to 37%), for multiple risk
factors and multiple related models.

Are there any methods to determine whether these ranges are due to
inherently noisy survey data, or simply indicate inappropriate /
unstable models?
