
RE: st: model-based standardization

From   "Garth Rauscher" <>
To   <>
Subject   RE: st: model-based standardization
Date   Tue, 5 Feb 2008 01:28:34 -0600

Austin: problem solved, just like that. Thanks for the solution. Garth

Garth Rauscher
Division of Epid/Bios (M/C 923)
UIC School of Public Health
1603 West Taylor Street
Chicago, IL 60612
ph: (312)413-4317
fx:  (312)996-0064

-----Original Message-----
[] On Behalf Of Austin Nichols
Sent: Monday, February 04, 2008 11:30 PM
Subject: Re: st: model-based standardization

I haven't looked at the ref you cite, but what you want is one of the
several quantities that economists refer to as marginal effects (see e.g.
the description of -margfx- and SE calcs at

Your code fails because you have not multiplied the estimated coefficients by
the corresponding variables. Try instead e.g.
logit Y X r2 r3 a2 a3
g p0=invlogit(_b[_cons]+ _b[r2]*r2+_b[r3]*r3+_b[a2]*a2+_b[a3]*a3)
g p1=invlogit(_b[_cons]+_b[X]+_b[r2]*r2+_b[r3]*r3+_b[a2]*a2+_b[a3]*a3)

but it's easier to use predict in any case:
ren X wasX
g X=0
predict pr0
replace X=1
predict pr1
drop X
ren wasX X

though you still have to make sure there are no neglected connections
between X and other variables (e.g. interactions).
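
As an illustration of the interaction caveat, here is a minimal sketch in which the model includes an X-by-r2 product term. The variable name Xr2 is hypothetical (it does not appear in the original model), and the simulated data are invented only so the example runs on its own. The point is that the product term must be rebuilt each time X is reassigned, before calling predict.

```stata
* Simulated data so the sketch runs on its own; all values are made up.
clear
set seed 12345
set obs 2000
g r = 1 + floor(3*runiform())
g a = 1 + floor(3*runiform())
g r2 = r==2
g r3 = r==3
g a2 = a==2
g a3 = a==3
g X = runiform() < 0.5
* Xr2 is a hypothetical interaction term, not part of the original model
g Xr2 = X*r2
g Y = runiform() < invlogit(-1 + 0.8*X + 0.5*r2 - 0.3*r3 + 0.2*a2 + 0.4*a3 + 0.5*Xr2)

logit Y X r2 r3 a2 a3 Xr2

* Set X=0 for everyone and rebuild the interaction before predicting
ren X wasX
ren Xr2 wasXr2
g X = 0
g Xr2 = X*r2
predict pr0

* Set X=1 for everyone and rebuild the interaction again
replace X = 1
replace Xr2 = X*r2
predict pr1

* Restore the original variables and summarize
drop X Xr2
ren wasX X
ren wasXr2 Xr2
g dp = pr1 - pr0
mean pr0 pr1 dp
```

If Xr2 were left untouched, observations with X=0 and r2=1 in the data would keep Xr2=0 under the X=1 scenario, so pr1 would omit the interaction coefficient for them.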

To bootstrap, simply wrap it in a program:

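prog dp
* drop variables left over from a previous call or bootstrap replication
cap drop pr0 pr1 dp
logit Y X r2 r3 a2 a3
ren X wasX
g X=0
predict pr0
replace X=1
predict pr1
drop X
ren wasX X
g dp=pr1-pr0
* -mean- leaves its estimates in e(b), which -bs- then resamples
mean pr0 pr1 dp
end
bs: dp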
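
For reference, the three quantities reported by -mean pr0 pr1 dp- can be written out as follows. This is only a sketch in notation that is not used in the thread: n is the number of observations, z_i is the covariate vector (r2, r3, a2, a3) for observation i, and hats denote the logit estimates.

```latex
\begin{align*}
\hat{P}_0 &= \frac{1}{n}\sum_{i=1}^{n}
  \operatorname{logit}^{-1}\!\bigl(\hat\beta_0 + z_i'\hat\gamma\bigr) \\
\hat{P}_1 &= \frac{1}{n}\sum_{i=1}^{n}
  \operatorname{logit}^{-1}\!\bigl(\hat\beta_0 + \hat\beta_X + z_i'\hat\gamma\bigr) \\
\widehat{PD} &= \hat{P}_1 - \hat{P}_0
\end{align*}
```

Each sum runs over the observed covariate values, which is what standardizes the probabilities to the sample's covariate distribution.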

On Feb 4, 2008 11:11 PM, Garth Rauscher <> wrote:
> [I tried to send this message to the listserv a few days ago but don't 
> think it made it through so I am trying again. I apologize if this is 
> a duplicate message.]
> Dear listserv members,
> I am attempting to learn how to perform a model-based standardization 
> with Stata, using the marginal or predictive margins method.  I would 
> like to be able to estimate standardized probabilities and probability 
> differences from logistic regression that are standardized to the 
> distribution of modeled covariates. The idea is summarized in: 
> "Greenland S. Model-based estimation of relative risks and other 
> epidemiologic measures in studies of common outcomes and in 
> case-control studies. Am J Epidemiol 2004;160:301-305." To the best of 
> my understanding, the method involves estimating predicted
> probabilities of Y under two scenarios (e.g. x=1 and x=0). Assuming we 
> have a dependent variable Y(0,1), an exposure of interest X(0,1), and 
> covariates r2 r3 a2 a3 a4 a5, two sets of predicted probabilities could be:
> P0(x) based on the joint distribution of covariates, with X=0 assigned 
> to everyone
> P1(x) based on the joint distribution of covariates, with X=1 assigned 
> to everyone
> PD(x) as the difference in probabilities,  P1(x) - P0(x)
> Below is my code.
> logit Y X r2 r3 a2 a3 a4 a5
> // predicted xbetas after assigning all observations to X=0
> g if0=_b[_cons]+_b[X]*0+_b[r2]+_b[r3]+_b[a2]+_b[a3]+_b[a4]+_b[a5]
> // predicted xbetas after assigning all observations to X=1
> g if1=_b[_cons]+_b[X]*1+_b[r2]+_b[r3]+_b[a2]+_b[a3]+_b[a4]+_b[a5]
> // predicted probabilities
> g p0x = invlogit(if0)
> g p1x = invlogit(if1)
> I was expecting two new variables of predicted probabilities, p0x and
> p1x with a range of values that depended on covariates. However, I 
> noticed that p0x and p1x each had only one value instead of a range of 
> values as I had expected (see above).  Any clarification as to what I 
> am doing incorrectly would be appreciated. I think my next task would 
> have been to perform bootstrapping to get confidence intervals from 
> the distribution of means for p0x, p1x and PD(x).