The Stata listserver

st: adjust prevalence (explained)


From: "Jan Brogger" <jan.brogger@med.uib.no>
To: "Statalist" <statalist@hsphsun2.harvard.edu>
Subject: st: adjust prevalence (explained)
Date: Wed, 28 Aug 2002 10:57:08 +0200

I recently posted about difficulties with -adjust-. After some
experimentation, I've come to the conclusion that -adjust- can't be
used to adjust prevalences to a target population. It is not a Stata
problem, but a conceptual problem, and it applies to the common
practice of "adjusting" binary data with logistic regression. Often,
one wants to adjust prevalences to one particular population (the
target population). In this post, I'll explain why -adjust- can't be
used for that. In the next post, I'll explain how I think it should be
done. Comments welcome.

-adjust <covar1> , by(<covar2>) pr - 
After logistic regression, this will make the prevalences of the outcome
by <covar2> comparable. They will not, however, refer to any
conceptually simple population.

It should be possible to make -adjust- refer to one particular
population by using
-adjust <covar1>=<covar1_mean> <covar2>=<covar2_mean>, by(<covar3>) pr-

It seems intuitively correct that you should get the correct answers if
you adjust to the covariate means of the target population. However,
you do not. The logistic function is nonlinear, so the predicted
probability at the mean of a covariate is not the mean of the predicted
probabilities. If the population you want to adjust to has 50% women,
then using -adjust sex=0.5, by(time)- will not give the correct answers.
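
The gap can be seen with plain arithmetic, without fitting any model. This is only a sketch: it uses the two group prevalences from the example data further down in this post (5 of 50 for sex==0, 20 of 50 for sex==1), and the scalar names p0, p1, p_true and p_at_mean are made up for illustration. With a single binary covariate the logit model reproduces the group prevalences exactly, so setting sex=0.5 amounts to averaging the two logits and transforming back.

```stata
* Group prevalences from the example data in this post
scalar p0 = 5/50
scalar p1 = 20/50

* Prevalence in a population that is half sex==0 and half sex==1:
* average the probabilities
scalar p_true = 0.5*scalar(p0) + 0.5*scalar(p1)
display scalar(p_true)

* What sex=0.5 does in the model: average the logits, then transform back
scalar p_at_mean = invlogit(0.5*logit(scalar(p0)) + 0.5*logit(scalar(p1)))
display scalar(p_at_mean)
```

The first number is 0.25, the second is about 0.214, so adjusting to the mean understates the prevalence here.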

If you have a single binary covariate to adjust for, the value you need
to adjust to is:
<covar1_mean> = (ln(target_prev/(1-target_prev)) - cons)/covar1_coeff
where target_prev is the target prevalence (the original prevalence in
the target population), which you have to supply, and cons (the
constant) and covar1_coeff come from the logistic regression. Note that
this value is generally not the mean of the covariate in the target
population. AFAIK, this solution does not generalize to more than one
predictor, and thus has very limited scope.
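
As a rough sketch, the value can be computed straight from the stored coefficients after -logit-. The data are the example data from this post; the scalar name sex_star is hypothetical, and invlogit() is assumed to be available (it is a built-in function in recent Stata versions). The formula simply solves cons + covar1_coeff*x = ln(target_prev/(1-target_prev)) for x.

```stata
* Example data from this post: asthma by sex, one row per cell
clear
input sex asthma freq
0 0 45
0 1 5
1 0 30
1 1 20
end
expand freq
quietly logit asthma sex

* Target prevalence to recover (25% in the example data)
scalar target_prev = 0.25

* Solve cons + covar1_coeff*x = logit(target_prev) for x
scalar sex_star = (ln(scalar(target_prev)/(1 - scalar(target_prev))) - _b[_cons]) / _b[sex]
display scalar(sex_star)

* Check: the prediction at x should equal the target prevalence
display invlogit(_b[_cons] + _b[sex]*scalar(sex_star))
```

For these data the computed value is about 0.6131, and the check returns the target prevalence of 0.25.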

The following code is an example. In code not shown here, I've tried
avoiding -adjust-, using -predict-, and even avoiding -predict- by
computing the logits directly from the coefficients. The results are
the same. The code sets up a population with just a single time point.
The original prevalence is 25%. We will try to get this original
prevalence back after a logistic regression.

clear
input sex asthma freq
0 0 45
0 1 5
1 0 30
1 1 20
end
* One observation per person instead of one row per cell
expand freq
logit asthma sex, or
* Constant, used only to get a single overall estimate from by()
gen all=1
* This next -adjust- adjusts to some strange population
adjust sex, by(all) pr
* This next -adjust- seems intuitive but doesn't work
adjust sex=0.5, by(all) pr
* This gives the original prevalence (25%), from the formula above
adjust sex=0.6131472, by(all) pr

Yours sincerely,

Jan Brogger, Institute of Medicine, University of Bergen, Norway
