
From: "Austin Nichols" <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Questions related to -predict-, -adjust-, and predictive margins
Date: Fri, 26 Sep 2008 07:58:25 -0400

Steven, Michael, et al.--

The "predicted marginals" you describe are what most economists would
call marginal effects (as opposed to the approximation offered by
-mfx-). To get any variety you like, you can wrap calls to -predict-
inside a -program- and -bootstrap- the program (resampling clusters if
that is more appropriate than resampling observations), like so:
http://www.stata.com/statalist/archive/2008-03/msg00667.html

Note that when E(y|X) = F(Xb), where F() is nonlinear (e.g. the
cumulative normal for probit, or the inverse logit) and f() is its
derivative, the quantity
    D E(y|X) / D x
(where D is capital delta, meaning "change in", and we consider a change
in x of one unit) is only approximately equal to f(Xb)b for each
observation. The mean of D E(y|X) / D x over observations is
approximately equal to the mean of f(Xb)b, or b times the mean of f(Xb),
but these are not equal to f(E(X)b)b, which is what -mfx- reports. I.e.
you cannot pass the E operator inside a nonlinear f.

The -mfx- approach gives the marginal effect for a single (imaginary)
observation, the "typical" individual in the data. But no obs in the
data may have a pattern of data anything like the mean of X (think of
all columns of X having extremely bimodal distributions), so the mean of
individual marginal effects seems more intuitively appealing.

See also http://terpconnect.umd.edu/~gelbach/ado/ and the file
margfx.ado, plus the discussion in
http://terpconnect.umd.edu/~gelbach/ado/margfx.pdf
to see that Gelbach uses the delta method for SEs, not the bootstrap as
suggested above.

On Thu, Sep 25, 2008 at 7:30 PM, Steven Samuels <sjhsamuels@earthlink.net> wrote:
> Michael would like a standard error for the weighted average of
>     P(foreign | himpg=0, Z) - P(foreign | himpg=1, Z)
> with other covariates at their original values Z. In SUDAAN parlance, the
> weighted average of the individual estimated probabilities is a "predicted
> marginal" and the difference is a contrast in the predicted marginals.
> (SUDAAN 8.0 Manual, p. 266).
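
A minimal sketch of the bootstrap approach suggested above, applied to the contrast Steven defines. It assumes Michael's unweighted auto example (-logit-, not -svy: logit-) and a simple unweighted mean. The program name pmdiff, the 200 replications, and the seed are illustrative choices, not anything from the thread.

```stata
* Sketch only: pmdiff is a hypothetical program name, not an official command.
capture program drop pmdiff
program define pmdiff, rclass
    version 10
    tempvar orig p0 p1 d
    * Refit the model on the current (re)sample
    quietly logit foreign himpg weight
    * Save the observed himpg so it can be restored afterwards
    quietly generate byte `orig' = himpg
    * Predicted probability with every observation set to himpg = 0
    quietly replace himpg = 0
    quietly predict double `p0', pr
    * Predicted probability with every observation set to himpg = 1
    quietly replace himpg = 1
    quietly predict double `p1', pr
    quietly replace himpg = `orig'
    * Contrast in predicted marginals: mean of P(himpg=0) - P(himpg=1)
    quietly generate double `d' = `p0' - `p1'
    summarize `d', meanonly
    return scalar diff = r(mean)
end

sysuse auto, clear
generate byte himpg = mpg > 25
bootstrap diff = r(diff), reps(200) seed(12345): pmdiff
```

For data from a complex design one would presumably resample clusters via the -cluster()- option of -bootstrap- and take a weighted mean inside the program; neither is shown here.
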
>
> SUDAAN can compute standard errors for the predicted marginals and their
> contrasts. Stata can compute the predicted marginals and contrasts, but not
> their standard errors.
>
> To compute the predicted marginals and their contrasts in Stata, run -svy:
> logit-. Then compute the individual predicted values: -adjust- will do it
> easily with the -pr- and -gen- options. Once individual predictions are at
> hand, -generate- the difference between any two. A call to -svy: mean- will
> compute the average of the predicted values (i.e. the "predicted marginals")
> and of the differences. However, the standard error produced by -svy: mean-
> will not account for uncertainty in the estimated coefficients, and so will
> be too small.
>
> Despite this, it may be useful, and perhaps something is to be learned
> graphing the distribution of the differences for various groups, with
> -dotplot-.
>
> Michael can also get an idea of the magnitude of error in individual
> predictions by computing confidence intervals for them; he can do this by
> running -predict- after his -svy: logit-. If he generates the linear
> predictor -xb- and its standard error -stdp-, he can compute a CI for the
> linear predictor, then endpoints for the predicted probability itself. He
> could then plot a histogram of the length of these intervals. -predictnl-
> run after -svy: logit- can also directly compute the difference in
> probabilities and will also produce a standard error for these differences.
>
> -Steve
>
> On Sep 24, 2008, at 3:53 PM, Michael I. Lichter wrote:
>
>> Question 1: How do you calculate SEs for predicted probabilities for data
>> that require weights or are from a complex sample design? I've seen the FAQ
>> about how to do this in general, but I suspect that the FAQ's advice is not
>> correct for weighted data/data from complex samples.
>>
>> Question 2: -adjust, pr ci- produces confidence intervals for
>> proportions.
>> Is it not the case that SE = (UB - LB)/(2 * 1.96) given a 95%
>> confidence interval (assuming that weights/design are not a problem)?
>>
>> Question 3: I want to calculate predictive margins (predictions where
>> every element is treated as if it belonged to a given group, but otherwise
>> the elements' own values are used in the prediction), AND I want to be able
>> to test for equality of predicted proportions. From what I glean from a
>> recent article in NEJM, SUDAAN can do this, but I don't know how.
>>
>> Here is an example that goes partway there:
>>
>> . sysuse auto
>> . gen himpg = mpg > 25
>>
>> . logit foreign himpg weight
>>
>> ------------------------------------------------------------------------------
>>      foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>        himpg |  -2.079449    .998357    -2.08   0.037    -4.036193   -.1227054
>>       weight |  -.0037159   .0009375    -3.96   0.000    -.0055534   -.0018785
>>        _cons |   9.795139   2.632037     3.72   0.000     4.636442    14.95384
>> ------------------------------------------------------------------------------
>>
>> . adjust himpg=0, pr ci
>>
>> --------------------------------------------------------------------------------
>>      Dependent variable: foreign     Command: logit
>>     Variable left as is: weight
>>  Covariate set to value: himpg = 0
>> --------------------------------------------------------------------------------
>>
>> ----------------------------------------------
>>       All |         pr          lb          ub
>> ----------+-----------------------------------
>>           |    .193884    [.085888    .381067]
>> ----------------------------------------------
>>      Key:  pr         =  Probability
>>            [lb , ub]  =  [95% Confidence Interval]
>>
>> . adjust himpg=1, pr ci
>>
>> --------------------------------------------------------------------------------
>>      Dependent variable: foreign     Command: logit
>>     Variable left as is: weight
>>  Covariate set to value: himpg = 1
>> --------------------------------------------------------------------------------
>>
>> ----------------------------------------------
>>       All |         pr          lb          ub
>> ----------+-----------------------------------
>>           |    .029187    [.003519    .203809]
>> ----------------------------------------------
>>      Key:  pr         =  Probability
>>            [lb , ub]  =  [95% Confidence Interval]
>>
>> What can I say about the relationship between the predictions (aside from
>> the obvious facts that they seem to be very different but their CIs are wide
>> and overlap)?
>>
>>       All |         pr          lb          ub
>> ----------+-----------------------------------
>>           |    .193884    [.085888    .381067]
>>           |    .029187    [.003519    .203809]
>> ----------------------------------------------
>>      Key:  pr         =  Probability
>>            [lb , ub]  =  [95% Confidence Interval]
>>
>> Thanks.
>>
>> Michael
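
Steven's two suggestions in the quoted text (a CI for each predicted probability built from -xb- and -stdp-, and -predictnl- for the difference in probabilities) might look roughly like this. The sketch uses Michael's plain -logit- on the auto data rather than -svy: logit-, and the new variable names (xb, se_xb, p_lb, p_ub, ci_len, dprob, dprob_se) are made up for illustration.

```stata
sysuse auto, clear
generate byte himpg = mpg > 25
quietly logit foreign himpg weight

* Linear predictor and its standard error
predict double xb, xb
predict double se_xb, stdp

* 95% CI on the logit scale, mapped to the probability scale
generate double p_lb = invlogit(xb - invnormal(0.975)*se_xb)
generate double p_ub = invlogit(xb + invnormal(0.975)*se_xb)

* Length of each interval, and its distribution
generate double ci_len = p_ub - p_lb
histogram ci_len

* Per-observation difference P(himpg=0) - P(himpg=1), with a
* delta-method standard error stored in dprob_se
predictnl double dprob = invlogit(_b[_cons] + _b[weight]*weight) ///
    - invlogit(_b[_cons] + _b[himpg] + _b[weight]*weight), se(dprob_se)
```

Note that dprob_se is a standard error for each observation's difference, not for the average difference over the sample.
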

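The distinction drawn at the top of this message, between the mean of f(Xb)b and f(E(X)b)b, can be checked by hand on Michael's example. This is a rough sketch for the coefficient on weight only, using the logistic density f(z) = F(z)(1 - F(z)); the variable and scalar names (xb, fxb, me_i, ame, mem, mh, mw, xbbar) are invented for the illustration.

```stata
sysuse auto, clear
generate byte himpg = mpg > 25
quietly logit foreign himpg weight

* Logistic density at each observation: f(xb) = F(xb)*(1 - F(xb))
predict double xb, xb
generate double fxb = invlogit(xb)*(1 - invlogit(xb))

* Mean over observations of f(Xb)*b for weight
generate double me_i = fxb*_b[weight]
summarize me_i, meanonly
scalar ame = r(mean)

* f(E(X)b)*b: evaluate the density once, at the covariate means
summarize himpg, meanonly
scalar mh = r(mean)
summarize weight, meanonly
scalar mw = r(mean)
scalar xbbar = _b[_cons] + _b[himpg]*scalar(mh) + _b[weight]*scalar(mw)
scalar mem = invlogit(scalar(xbbar))*(1 - invlogit(scalar(xbbar)))*_b[weight]

display "mean of f(Xb)b = " scalar(ame)
display "f(E(X)b)b      = " scalar(mem)
```

Both quantities use the derivative f(Xb)b, so each is still only an approximation to the effect of a one-unit change in weight.
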
