Statalist


Re: st: Questions related to -predict-, -adjust-, and predictive margins


From   "Austin Nichols" <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Questions related to -predict-, -adjust-, and predictive margins
Date   Fri, 26 Sep 2008 07:58:25 -0400

Steven, Michael, et al.--
The "predicted marginals" you describe are what most economists would
call marginal effects (as opposed to the approximation offered by
-mfx-); to get any variety you like, you can wrap calls to -predict-
inside a -program- and -bootstrap- the program (resampling clusters if
that is more appropriate than resampling observations) like so:
http://www.stata.com/statalist/archive/2008-03/msg00667.html
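
A minimal sketch of that approach, using the auto data from Michael's
example quoted below. The program name -avgdiff- and the returned
scalar r(diff) are made up for illustration, and the number of
replications and the seed are arbitrary:

```stata
* Sketch only: bootstrap the average difference in predicted probabilities.
* -avgdiff- and r(diff) are hypothetical names chosen for this example.
capture program drop avgdiff
program define avgdiff, rclass
    tempvar keep p0 p1 d
    quietly {
        logit foreign himpg weight
        generate byte `keep' = himpg        // save the observed values
        replace himpg = 0
        predict double `p0', pr             // P(foreign | himpg=0, Z)
        replace himpg = 1
        predict double `p1', pr             // P(foreign | himpg=1, Z)
        replace himpg = `keep'              // restore the observed values
        generate double `d' = `p0' - `p1'
        summarize `d', meanonly
    }
    return scalar diff = r(mean)            // mean difference over observations
end

sysuse auto, clear
generate byte himpg = mpg > 25
bootstrap diff = r(diff), reps(200) seed(12345): avgdiff
```

With clustered data, adding the cluster() option to -bootstrap-
resamples clusters rather than observations; sampling weights and
strata would need further thought.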

Note that when E(y|X)=F(Xb), where F() is nonlinear (e.g. the
cumulative normal for probit, or the inverse logit) with derivative
f(), the quantity
 D E(y|X) / D x
(where D is capital delta, meaning "change in", and we consider a
change in x of one unit) is only approximately equal to f(Xb)b for
each observation. The mean of  D E(y|X) / D x  over observations is
approximately equal to the mean of  f(Xb)b, i.e. b times the mean of
f(Xb), but these are not equal to  f(E(X)b)b, which is what -mfx-
reports. I.e. you cannot pass the E operator inside a nonlinear f.
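
To see the two quantities side by side, here is a rough sketch on the
auto data from Michael's example quoted below, for the continuous
covariate weight. It uses the fact that the logistic density is
f(Xb) = p(1-p); the variable and macro names are made up for
illustration:

```stata
* Sketch only: mean of f(Xb)b versus f(E(X)b)b for -weight- after -logit-.
sysuse auto, clear
generate byte himpg = mpg > 25
quietly logit foreign himpg weight

* Mean over observations of the individual effects f(Xb)b
predict double p, pr
generate double me_w = p*(1 - p)*_b[weight]
summarize me_w

* The same formula evaluated once, at the means of the covariates
summarize himpg, meanonly
local mh = r(mean)
summarize weight, meanonly
local mw = r(mean)
local xb = _b[_cons] + _b[himpg]*`mh' + _b[weight]*`mw'
display "effect at the means: " invlogit(`xb')*(1 - invlogit(`xb'))*_b[weight]
```

The mean reported by -summarize- and the displayed value will
generally differ, because f() is nonlinear.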

That -mfx- approach gives the marginal effect for a single
(imaginary) observation, the "typical" individual in the data.
But it may be that no observation in the data has a pattern of
covariates anything like the mean of X (think of all columns of X
having extremely bimodal distributions), so the mean of individual
marginal effects seems more intuitively appealing.

See also http://terpconnect.umd.edu/~gelbach/ado/ and the file
margfx.ado plus discussion in
http://terpconnect.umd.edu/~gelbach/ado/margfx.pdf
to see that Gelbach uses the delta method for SEs, not the bootstrap
as suggested above.
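
For the per-observation quantities Steve describes in the message
quoted below, a rough sketch follows. It assumes plain -logit- on the
auto data rather than -svy: logit-, and the new variable names are
made up for illustration:

```stata
* Sketch only: per-observation CIs and differences after -logit-.
sysuse auto, clear
generate byte himpg = mpg > 25
quietly logit foreign himpg weight

* CI for the linear predictor, then transformed to the probability scale
predict double xb, xb
predict double se_xb, stdp
generate double lb = invlogit(xb - 1.96*se_xb)
generate double ub = invlogit(xb + 1.96*se_xb)
generate double len = ub - lb
histogram len

* Difference in predicted probabilities, with a delta-method SE
predictnl double d = invlogit(_b[_cons] + _b[weight]*weight)      ///
    - invlogit(_b[_cons] + _b[himpg] + _b[weight]*weight), se(d_se)
summarize d d_se
```

The standard errors in d_se are for each observation's difference,
not for the average difference over the sample.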

On Thu, Sep 25, 2008 at 7:30 PM, Steven Samuels
<sjhsamuels@earthlink.net> wrote:
> Michael would like a standard error for the weighted average of
>           P(foreign|himpg =0, Z) - P(foreign | high mpg=1, Z)
> with other covariates at their original values Z. In SUDAAN parlance, the
> weighted average of the individual estimated probabilities is a "predicted
> marginal" and the difference is a contrast in the predicted marginals.
> (SUDAAN 8.0 Manual, p. 266).
>
> SUDAAN can compute standard errors for the predicted marginals and their
> contrasts. Stata can compute the predicted marginals and contrasts, but not
> their standard errors.
>
> To compute the predicted marginals and their contrasts in Stata, run -svy:
> logit-. Then compute the individual predicted values: -adjust- will do it
> easily with the -pr- and -gen- options. Once individual predictions are at
> hand, -generate- the difference between any two. A call to -svy: mean- will
> compute the average of the predicted values (i.e. the "predicted marginals")
> and of the differences. However, the standard error produced by -svy: mean-
> will not account for uncertainty in the estimated coefficients, and so will
> be too small.
>
> Despite this, it may be useful, and perhaps something is to be learned by
> graphing the distribution of the differences for various groups, with
> -dotplot-.
>
> Michael can also get an idea of the magnitude of error in individual
> predictions by computing confidence intervals for them; he can do this by
> running -predict- after his -svy: logit-. If he generates the linear
> predictor -xb- and its standard error -stdp-, he can compute a CI for the
> linear predictor, then endpoints for the predicted probability itself. He
> could then plot a histogram of the length of these intervals. -predictnl-
> run after -svy: logit- can also directly compute the difference in
> probabilities and will also produce a standard error for these differences.
>
>
>
> -Steve
>
>
>
> On Sep 24, 2008, at 3:53 PM, Michael I. Lichter wrote:
>
>> Question 1: How do you calculate SEs for predicted probabilities for data
>> that require weights or are from a complex sample design? I've seen the FAQ
>> about how to do this in general, but I suspect that the FAQ's advice is not
>> correct for weighted data/data from complex samples.
>>
>> Question 2: -adjust, pr  ci- produces confidence intervals for
>> proportions. Is it not the case that SE = (UB - LB)/(2 * 1.96) given a 95%
>> confidence interval (assuming that weights/design are not a problem)?
>>
>> Question 3: I want to calculate predictive margins (predictions where
>> every element is treated as if it belonged to a given group, but otherwise
>> the elements' own values are used in the prediction), AND I want to be able
>> to test for equality of predicted proportions. From what I glean from a
>> recent article in NEJM, SUDAAN can do this, but I don't know how.
>>
>> Here is an example that goes partway there:
>>
>> . sysuse auto
>> . gen himpg = mpg > 25
>>
>> . logit foreign himpg weight
>>
>> ------------------------------------------------------------------------------
>>      foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>        himpg |  -2.079449    .998357    -2.08   0.037    -4.036193   -.1227054
>>       weight |  -.0037159   .0009375    -3.96   0.000    -.0055534   -.0018785
>>        _cons |   9.795139   2.632037     3.72   0.000     4.636442    14.95384
>> ------------------------------------------------------------------------------
>>
>> . adjust himpg=0, pr ci
>>
>> --------------------------------------------------------------------------------
>>    Dependent variable: foreign     Command: logit
>>   Variable left as is: weight
>> Covariate set to value: himpg = 0
>>
>> --------------------------------------------------------------------------------
>> ----------------------------------------------
>>     All |         pr          lb          ub
>> ----------+-----------------------------------
>>         |    .193884    [.085888    .381067]
>> ----------------------------------------------
>>    Key:  pr         =  Probability
>>          [lb , ub]  =  [95% Confidence Interval]
>>
>> . adjust himpg=1, pr ci
>>
>>
>> --------------------------------------------------------------------------------
>>    Dependent variable: foreign     Command: logit
>>   Variable left as is: weight
>> Covariate set to value: himpg = 1
>>
>> --------------------------------------------------------------------------------
>> ----------------------------------------------
>>     All |         pr          lb          ub
>> ----------+-----------------------------------
>>         |    .029187    [.003519    .203809]
>> ----------------------------------------------
>>    Key:  pr         =  Probability
>>          [lb , ub]  =  [95% Confidence Interval]
>>
>>
>> What can I say about the relationship between the predictions (aside from
>> the obvious facts that they seem to be very different but their CIs are wide
>> and overlap)?
>>
>>     All |         pr          lb          ub
>> ----------+-----------------------------------
>>         |    .193884    [.085888    .381067]
>>         |    .029187    [.003519    .203809]
>> ----------------------------------------------
>>    Key:  pr         =  Probability
>>          [lb , ub]  =  [95% Confidence Interval]
>> Thanks.
>>
>> Michael