# Re: st: Questions related to -predict-, -adjust-, and predictive margins

 From "Austin Nichols" To statalist@hsphsun2.harvard.edu Subject Re: st: Questions related to -predict-, -adjust-, and predictive margins Date Fri, 26 Sep 2008 07:58:25 -0400

```
Steven, Michael, et al.--
The "predicted marginals" you describe are what most economists would
call marginal effects (as opposed to the approximation offered by
-mfx-); to get any variety you like, you can wrap calls to -predict-
inside a -program- and -bootstrap- the program (resampling clusters if
that is more appropriate than resampling observations) like so:
http://www.stata.com/statalist/archive/2008-03/msg00667.html

Note that when E(y|X)=F(Xb), where F() is nonlinear (e.g. the cumulative
normal for probit, or the inverse logit) and f() is its derivative,
D E(y|X) / D x, where D is capital delta, meaning "change in", and we
consider a change in x of one unit,
is only approximately equal to f(Xb)b for each observation,
and the mean of D E(y|X) / D x over observations is approximately equal
to the mean of f(Xb)b,
or b times the mean of f(Xb), but these are not equal to f(E(X)b)b,
which is what -mfx- reports.
I.e. you cannot pass the E operator inside a nonlinear f.

That -mfx- approach is like the marginal effect for a single
(imaginary) observation which is the "typical" individual in the data,
but no obs in the data may have a pattern of data anything like the
mean of X (think of all columns of X having extremely bimodal
distributions), so the mean of individual marginal effects seems more
intuitively appealing.

to see that Gelbach uses the delta method for SEs, not the bootstrap
as suggested above.

On Thu, Sep 25, 2008 at 7:30 PM, Steven Samuels wrote:
> Michael would like a standard error for the weighted average of
>           P(foreign | himpg=0, Z) - P(foreign | himpg=1, Z)
> with other covariates at their original values Z. In SUDAAN parlance, the
> weighted average of the individual estimated probabilities is a "predicted
> marginal" and the difference is a contrast in the predicted marginals.
> (SUDAAN 8.0 Manual, p. 266).
>
> SUDAAN can compute standard errors for the predicted marginals and their
> contrasts. Stata can compute the predicted marginals and contrasts, but not
> their standard errors.
>
> To compute the predicted marginals and their contrasts in Stata, run -svy:
> logit-. Then compute the individual predicted values: -adjust- will do it
> easily with the -pr- and -gen- options. Once individual predictions are at
> hand, -generate- the difference between any two. A call to -svy: mean- will
> compute the average of the predicted values (i.e. the "predicted marginals")
> and of the differences. However, the standard error produced by -svy: mean-
> will not account for uncertainty in the estimated coefficients, and so will
> be too small.
>
> Despite this, it may be useful, and perhaps something is to be learned by
> graphing the distribution of the differences for various groups, with
> -dotplot-.
>
> Michael can also get an idea of the magnitude of error in individual
> predictions by computing confidence intervals for them; he can do this by
> running -predict- after his -svy: logit-. If he generates the linear
> predictor -xb- and its standard error -stdp-, he can compute a CI for the
> linear predictor, then endpoints for the predicted probability itself. He
> could then plot a histogram of the length of these intervals. -predictnl-
> run after -svy: logit- can also directly compute the difference in
> probabilities and will also produce a standard error for these differences.
>
>
>
> -Steve
>
>
>
> On Sep 24, 2008, at 3:53 PM, Michael I. Lichter wrote:
>
>> Question 1: How do you calculate SEs for predicted probabilities for data
>> that require weights or are from a complex sample design? I've seen the FAQ
>> about how to do this in general, but I suspect that the FAQ's advice is not
>> correct for weighted data/data from complex samples.
>>
>> Question 2: -adjust, pr  ci- produces confidence intervals for
>> proportions. Is it not the case that SE = (UB - LB)/(2 * 1.96) given a 95%
>> confidence interval (assuming that weights/design are not a problem)?
>>
>> Question 3: I want to calculate predictive margins (predictions where
>> every element is treated as if it belonged to a given group, but otherwise
>> the elements' own values are used in the prediction), AND I want to be able
>> to test for equality of predicted proportions. From what I glean from a
>> recent article in NEJM, SUDAAN can do this, but I don't know how.
>>
>> Here is an example that goes partway there:
>>
>> . sysuse auto
>> . gen himpg = mpg > 25
>>
>> . logit foreign himpg weight
>>
>> ------------------------------------------------------------------------------
>>      foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>        himpg |  -2.079449    .998357    -2.08   0.037    -4.036193   -.1227054
>>       weight |  -.0037159   .0009375    -3.96   0.000    -.0055534   -.0018785
>>        _cons |   9.795139   2.632037     3.72   0.000     4.636442    14.95384
>> ------------------------------------------------------------------------------
>>
>> . adjust himpg=0, pr ci
>>
>> --------------------------------------------------------------------------------
>>    Dependent variable: foreign     Command: logit
>>   Variable left as is: weight
>> Covariate set to value: himpg = 0
>>
>> --------------------------------------------------------------------------------
>> ----------------------------------------------
>>     All |         pr          lb          ub
>> ----------+-----------------------------------
>>         |    .193884    [.085888    .381067]
>> ----------------------------------------------
>>    Key:  pr         =  Probability
>>          [lb , ub]  =  [95% Confidence Interval]
>>
>> . adjust himpg=1, pr ci
>>
>>
>> --------------------------------------------------------------------------------
>>    Dependent variable: foreign     Command: logit
>>   Variable left as is: weight
>> Covariate set to value: himpg = 1
>>
>> --------------------------------------------------------------------------------
>> ----------------------------------------------
>>     All |         pr          lb          ub
>> ----------+-----------------------------------
>>         |    .029187    [.003519    .203809]
>> ----------------------------------------------
>>    Key:  pr         =  Probability
>>          [lb , ub]  =  [95% Confidence Interval]
>>
>>
>> What can I say about the relationship between the predictions (aside from
>> the obvious facts that they seem to be very different but their CIs are wide
>> and overlap)?
>>
>>     All |         pr          lb          ub
>> ----------+-----------------------------------
>>         |    .193884    [.085888    .381067]
>>         |    .029187    [.003519    .203809]
>> ----------------------------------------------
>>    Key:  pr         =  Probability
>>          [lb , ub]  =  [95% Confidence Interval]
>> Thanks.
>>
>> Michael
```
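
The two points in the reply above can be sketched in Stata using the -sysuse auto- example from the original question. This is a minimal illustration under stated assumptions, not the code from the linked archive post: the program name `avgdiff`, the variable names `p` and `me_i`, the number of replications, and the seed are all made up here, and the sketch ignores weights and survey design. Part A compares the mean of the individual marginal effects of weight with the effect evaluated at the mean of X; part B wraps -predict- in a -program- and bootstraps the average difference in predicted probabilities between himpg=0 and himpg=1.

```stata
* --- Part A: mean of f(Xb)b versus f(E(X)b)b for the logit model -----------
sysuse auto, clear
generate byte himpg = mpg > 25
quietly logit foreign himpg weight

* Mean of individual marginal effects of weight; for logit, f(Xb) = p*(1 - p)
predict double p, pr
generate double me_i = p*(1 - p)*_b[weight]
summarize me_i, meanonly
display "mean of f(Xb)b = " r(mean)

* Marginal effect evaluated at the sample means of the covariates
summarize himpg, meanonly
local mh = r(mean)
summarize weight, meanonly
local mw = r(mean)
local pbar = invlogit(_b[_cons] + _b[himpg]*`mh' + _b[weight]*`mw')
display "f(E(X)b)b      = " `pbar'*(1 - `pbar')*_b[weight]
drop p me_i

* --- Part B: bootstrap the average difference in predicted probabilities ---
* avgdiff is a hypothetical name; it refits the model on whatever data are
* in memory, so -bootstrap- re-estimates the coefficients in each resample.
capture program drop avgdiff
program define avgdiff, rclass
    version 10
    tempvar orig p0 p1 d
    quietly {
        logit foreign himpg weight
        generate byte `orig' = himpg
        replace himpg = 0                 // everyone treated as himpg = 0
        predict double `p0', pr
        replace himpg = 1                 // everyone treated as himpg = 1
        predict double `p1', pr
        replace himpg = `orig'            // restore the observed values
        generate double `d' = `p0' - `p1'
        summarize `d', meanonly
    }
    return scalar diff = r(mean)
end

bootstrap diff = r(diff), reps(200) seed(12345): avgdiff
```

Because the model is refit inside the program, the bootstrap standard error reflects uncertainty in the estimated coefficients, which the -svy: mean- approach described in the quoted message does not. For clustered data, -bootstrap- accepts a cluster() option so that whole clusters are resampled, as the reply suggests; replications in which the logit cannot be fit as specified (for example, because himpg predicts the outcome perfectly in a resample) would need separate attention.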