st: model-based standardization

From   "Garth Rauscher" <>
To   <>
Subject   st: model-based standardization
Date   Mon, 4 Feb 2008 22:11:39 -0600

[I tried to send this message to the listserv a few days ago, but I don't think
it made it through, so I am trying again. I apologize if this is a duplicate.]

Dear listserv members,

I am attempting to learn how to perform a model-based standardization with
Stata, using the marginal or predictive margins method.  I would like to be
able to estimate standardized probabilities and probability differences from
logistic regression that are standardized to the distribution of modeled
covariates. The idea is summarized in: "Greenland S. Model-based estimation
of relative risks and other epidemiologic measures in studies of common
outcomes and in case-control studies. Am J Epidemiol 2004;160:301-305." To
the best of my understanding, the method involves estimating predicted
probabilities of Y under two scenarios (e.g. x=1 and x=0). Assuming we have
a dependent variable Y(0,1), an exposure of interest X(0,1), and covariates
r2 r3 a2 a3 a4 a5, two sets of predicted probabilities could be:
P0(x) based on the joint distribution of covariates, with X=0 assigned to all observations
P1(x) based on the joint distribution of covariates, with X=1 assigned to all observations
PD(x) as the difference in probabilities, P1(x) - P0(x)
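
Written out, these three quantities might look as follows. This is only a sketch of the idea: the symbols n (number of observations), z_i (the covariate values r2 r3 a2 a3 a4 a5 for observation i), and the beta coefficients are notation introduced here for illustration, and a logistic model with no interaction terms is assumed.

```latex
\[
\operatorname{expit}(u) = \frac{1}{1 + e^{-u}}
\]
\[
P_0 = \frac{1}{n} \sum_{i=1}^{n}
      \operatorname{expit}\!\left(\beta_0 + \beta_X \cdot 0 + z_i^{\top}\beta_z\right),
\qquad
P_1 = \frac{1}{n} \sum_{i=1}^{n}
      \operatorname{expit}\!\left(\beta_0 + \beta_X \cdot 1 + z_i^{\top}\beta_z\right)
\]
\[
PD = P_1 - P_0
\]
```

Each term in the sums uses that observation's own covariate values, so the individual predicted probabilities vary across observations before they are averaged.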

Below is my code. 


logit Y X r2 r3 a2 a3 a4 a5

// predicted xbetas after assigning all observations to X=0

g if0=_b[_cons]+_b[X]*0+_b[r2]+_b[r3]+_b[a2]+_b[a3]+_b[a4]+_b[a5] 


// predicted xbetas after assigning all observations to X=1

g if1=_b[_cons]+_b[X]*1+_b[r2]+_b[r3]+_b[a2]+_b[a3]+_b[a4]+_b[a5]


// predicted probabilities

g p0x = invlogit(if0)
g p1x = invlogit(if1)


groups p0x p1x 


  |      p0x        p1x   Freq.   Percent |
  | .1368349   .4431594     444    100.00 |

I was expecting two new variables of predicted probabilities, p0x and p1x,
each with a range of values that depended on the covariates. However, p0x and
p1x each take only one value (see above). Any clarification as to what I am
doing incorrectly would be appreciated. My next step would be to bootstrap
the means of p0x, p1x and PD(x) to get confidence intervals.
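
For comparison, here is a minimal sketch of a calculation in which the predictions do depend on each observation's covariates. It assumes the same variable names as the code above and lets predict apply the fitted coefficients to each observation's own covariate values after X has been set to 0 and then to 1. The names X_obs, p0x_m, p1x_m and pdx_m are hypothetical, and this is one possible approach rather than a definitive implementation.

```stata
logit Y X r2 r3 a2 a3 a4 a5

// keep a copy of the observed exposure (X_obs is a hypothetical name)
generate X_obs = X

// predicted probabilities with every observation assigned X=0
replace X = 0
predict p0x_m, pr

// predicted probabilities with every observation assigned X=1
replace X = 1
predict p1x_m, pr

// restore the observed exposure
replace X = X_obs
drop X_obs

// per-observation difference; the sample means are the standardized values
generate pdx_m = p1x_m - p0x_m
summarize p0x_m p1x_m pdx_m
```

The means reported by summarize would be the standardized probabilities and their difference. Confidence intervals would still need something like the bootstrap mentioned above.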

Thanks in advance. Garth

Garth Rauscher
Division of Epid/Bios (M/C 923)
UIC School of Public Health
1603 West Taylor Street
Chicago, IL 60612
ph: (312)413-4317
fx:  (312)996-0064


