



Re: st: predict


From   Chiara Mussida <cmussida@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: predict
Date   Mon, 6 Jun 2011 10:07:46 +0200

Dear All,
Many thanks to Maarten and Richard for their valuable help.
One doubt remains unresolved.
I compute the predicted probabilities from my mlogit as:

pr1 = exp(b0 + b1 x1)/(exp(b0 + b1 x1) + exp(b0 + b2 x2) + 1)

where pr1 is the predicted probability of outcome 1, b0 is a constant,
and b1 and b2 are the coefficients for outcomes 1 and 2. Here I assume
that there are three outcomes in total and that outcome 3 is the base
category.

This computation, carried out using the coefficients from the Stata
output (mlogit command), gives a different result from the predict
command (an mlogit postestimation command), used as follows:

* Predict probabilities of outcome 1 for estimation sample
predict p1 if e(sample), outcome(1)

My question is: why do the two computations give different predicted
probabilities? Is it perhaps related to the method of computation
behind the predict command?
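
For reference, here is a minimal sketch of the standard multinomial logit probability formula, written in Python rather than Stata so that it is self-contained. It assumes that each non-base outcome has its own constant and its own coefficient on every covariate, and that the base outcome has a linear predictor of zero. The function name mlogit_probs, the covariate names and all coefficient values are hypothetical and are not taken from any output in this thread. A hand calculation that shares one constant across equations, or that uses the rounded coefficients shown in the output table, could differ from this.

```python
import math

def mlogit_probs(x, coefs, base):
    """Predicted probabilities for one observation of a multinomial logit.

    x     : dict mapping covariate name -> value
    coefs : dict mapping each non-base outcome -> dict of coefficients,
            including its own constant under the key "_cons"
    base  : label of the base outcome (linear predictor fixed at 0)
    """
    xb = {base: 0.0}
    for outcome, b in coefs.items():
        # Each outcome uses its own constant and its own coefficients,
        # applied to the same covariate values.
        xb[outcome] = b["_cons"] + sum(b[name] * value for name, value in x.items())
    denom = sum(math.exp(v) for v in xb.values())
    return {outcome: math.exp(v) / denom for outcome, v in xb.items()}

# Hypothetical coefficients for outcomes 1 and 2; outcome 3 is the base.
coefs = {
    1: {"_cons": 0.5, "age": -0.02, "female": 0.3},
    2: {"_cons": -1.0, "age": 0.01, "female": -0.4},
}
x = {"age": 40.0, "female": 1.0}

probs = mlogit_probs(x, coefs, base=3)
for outcome in sorted(probs):
    print(outcome, probs[outcome])
```

The three probabilities sum to one by construction, which is a quick check to apply to any hand calculation before comparing it with predict.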

Many Thanks
C








On 3 June 2011 09:42, Maarten Buis <maartenlbuis@gmail.com> wrote:
> --- On 2 June 2011 18:08, Chiara Mussida wrote:
>> I simply want the coefficients (of my covariates) which allow me to
>> get the predicted outcome of each equation of my MNL.
>>
>> example: I get a predicted probability (say to move from employment to
>> unemployment) of 0.4:
>> what is the contribution (numerical) of each covariate I included in
>> my equation (such as sex, individual age, etc.). Is it given by the
>> exponential of the coef I find in the Stata output? therefore by
>> summing/subtracting the exp of each coef I get my predicted of 0.4
>> (but there is also a standard error)
>
> The contribution of each variable to the predicted probability is
> neither its coefficient nor the exponential of that coefficient. It is
> a non-linear function you can find in any introductory text on
> multinomial regression. So you cannot use a set of additions of
> coefficients to get to the predicted probability.
>
> If you want to give an exact representation of the model you will have
> to look at relative risks or odds(*) (**), this is:
>
> relative risk = exp(b0 + b1 x1 + b2 x2 + ...)
>
> or, equivalently
>
> relative risk = exp(b0) * exp(b1 x1) * exp(b2 x2) * ...
>
> Alternatively, you can fit a linear model on top of your multinomial
> logistic regression, and use those results to summarize the results.
> This is what you do when you compute marginal effects. As this is the
> result of a model on top of a model it will not be an exact
> representation of the original multinomial regression model, so the
> addition of coefficients will in all likelihood lead to deviations
> from the actual predicted probabilities. On the plus side, you can now
> interpret your results in terms of probabilities instead of relative
> risks.
>
> The fact that marginal effects are not an exact representation of the
> model results is not necessarily bad. Marginal effects form a model of
> your multinomial regression model, and models aren't supposed to be
> exact, they are only supposed to be useful. Whether or not this model
> of a model is useful depends on the exact aim of the exercise. If you
> do this in order to compute some kind of decomposition of effects,
> then I would stick to the exact representation; if I were presenting
> results, then I would look at who my audience is. There are also cases
> where the underlying multinomial regression model is so complicated,
> that the linear approximation implicit in the marginal effects starts
> to struggle. For example it is not uncommon for correctly computed
> marginal effects of interaction terms to be significantly positive for
> some respondents, significantly negative for others, and
> non-significant for the remaining respondents. In most cases, that is
> hardly a useful conclusion.
>
> Hope this helps,
> Maarten
>
> (*) There are some differences between disciplines in whether the
> outcomes of a multinomial logistic regression can be called an odds or
> whether a new term like relative risk has to be invented for it. See,
> for example: <http://www.stata.com/statalist/archive/2007-02/msg00085.html>
>
> (**) Notice that I say here relative risk or odds, I did not say
> relative risk ratio or odds ratio. It is a common mistake to assume
> that these things are the same.
>
>
> --------------------------
> Maarten L. Buis
> Institut fuer Soziologie
> Universitaet Tuebingen
> Wilhelmstrasse 36
> 72074 Tuebingen
> Germany
>
>
> http://www.maartenbuis.nl
> --------------------------
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
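
The two forms of the relative risk given in the quoted reply, exp(b0 + b1 x1 + b2 x2 + ...) and exp(b0) * exp(b1 x1) * exp(b2 x2) * ..., can be checked numerically. The following is a small Python sketch. The coefficient and covariate values are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical coefficients for one non-base outcome, and one observation.
b = {"_cons": 0.5, "age": -0.02, "female": 0.3}
x = {"age": 40.0, "female": 1.0}

# Form 1: exponentiate the whole linear predictor.
rr_sum = math.exp(b["_cons"] + sum(b[k] * x[k] for k in x))

# Form 2: multiply the exponentiated pieces.
rr_prod = math.exp(b["_cons"]) * math.prod(math.exp(b[k] * x[k]) for k in x)

# Raising one covariate by one unit multiplies the relative risk by
# exp(coefficient); it does not add a fixed amount to it.
x_older = dict(x, age=x["age"] + 1.0)
rr_older = math.exp(b["_cons"] + sum(b[k] * x_older[k] for k in x_older))
ratio = rr_older / rr_sum

print(rr_sum, rr_prod, ratio, math.exp(b["age"]))
```

The multiplicative structure is why the contributions of the covariates combine by multiplication on the relative risk scale and not by addition on the probability scale.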



-- 
Chiara Mussida
PhD candidate
Doctoral school of Economic Policy
Catholic University, Piacenza (Italy)


