
Re: st: Marginal effects in Probit


From   Maarten Buis <[email protected]>
To   [email protected]
Subject   Re: st: Marginal effects in Probit
Date   Tue, 7 Aug 2012 10:13:34 +0200

On Tue, Aug 7, 2012 at 4:45 AM, Shikha Sinha wrote:
> I am running a probit model as my dependent variable is binary. For
> ease of interpretation, I am estimating marginal effects at mean.
>
> My question: What is the relationship between the magnitudes of (a)
> probit coefficient and (b) marginal effects at mean? Is (a) always
> bigger than (b) or vice versa? Is it possible to say a priori on what
> factors these magnitudes would depend?

Yes, the marginal effect is normalden(xb)*b, where xb is the linear
predictor and b is the coefficient of the variable of interest. So
the marginal effect will always be smaller in absolute value than the
probit coefficient, as the density of a standard normal distribution
never exceeds a value a bit less than .4 (normalden(0) to be
precise). In the first part of the example below I show that this is
indeed the formula that Stata uses.

In the second part I illustrate that the marginal effect is not
constant across individuals. This is a logical consequence of fitting
a non-linear model like -probit-; if the marginal effect were
constant, then we would be fitting a linear model. Typically, people
do not want to report many different marginal effects for the same
variable, so they instead report a summary measure, like the average
marginal effect. However, this leads to an inconsistency in their
argument: by only reporting (or only discussing) the average marginal
effect they are in effect turning their non-linear model into a
linear model. Either the non-linearity was not so important, but then
why estimate your linear model in such a roundabout two-step manner?
You would then be much better off fitting a linear probability model,
which is a much more direct and honest way of presenting a linear
model. Or you think that the non-linearity in the probit model is
crucial, but then you cannot make do with a one-number summary of the
marginal effect, because that summary would undo exactly that crucial
non-linearity.

I tend to prefer to first choose the preferred (and, given the data,
possible) metric for the effect and then choose the model such that
it immediately returns that metric. If you don't do that --- and
consequently use things like marginal effects to force your preferred
metric on a model for which it is not the natural metric --- then you
will always run into the kind of "friction" or inconsistency
discussed above. If you have a binary dependent variable and you want
to ensure that the predictions always remain between 0 and 1, then I
tend to prefer odds ratios, and thus a -logit- model instead of a
-probit-. Odds and odds ratios have an undeserved reputation of being
hard to interpret, so you need to be a bit careful about how you
present your results. One possible way of doing that is given in this
Stata tip: M.L. Buis (2012) "Stata tip 107: The baseline is now
reported", The Stata Journal, 12(1), pp. 165-166.

*---------------------- begin example ---------------------------
sysuse nlsw88, clear
gen byte high_occ = occupation < 3 if occupation < .
probit union grade high_occ i.race

// collect the mean
tempname mgrade xb
sum grade if e(sample), meanonly
scalar `mgrade' = r(mean)

scalar `xb' = _b[_cons] + _b[grade]*`mgrade' + _b[high_occ]*0 ///
              + _b[2.race]*0 + _b[3.race]*0

// the marginal effect at those values, computed by hand
di normalden(`xb')*_b[grade]

// compare with the result computed by Stata
margins, dydx(grade) atmeans at(high_occ=0 race=1)

// see how variable the marginal effect
// is across observations
predictnl marg = normalden(xb())*_b[grade] if e(sample)

twoway scatter marg grade

// so typically we report the average marginal effect
margins, dydx(grade)

// but in that case we are better off estimating a
// linear probability model instead
reg union grade high_occ i.race, vce(robust)
*----------------------- end example ----------------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )

Hope this helps,
Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------

