Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: heckman selection model in stata question

From: Maarten Buis; To: statalist@hsphsun2.harvard.edu; Subject: st: heckman selection model in stata question; Date: Mon, 28 Nov 2011 09:27:36 +0100

```The reason why Statalist bounced your messages is that you did not
send them as plain text. See the Statalist FAQ:
<http://www.stata.com/support/faqs/res/statalist.html> for a list of
further possible reasons why your email got bounced and what to do
about it. I'll forward your message to Statalist now, but I obviously
cannot do so every time any user has trouble.

Your two predictions are computed on different subsamples (college ==
0 and college == 1), so the difference between them will also include
differences in the distribution of your control variables across the
two groups. So if you do that you are no longer controlling for your
control variables. This is not necessarily bad, but it is something
you have to consciously decide.

The key part of the -margins- command is the
-expression(normal(xb(select)))- option. Notice that you cannot refer
to xb(stock). The reason is that Stata internally calls the selection
equation select and not stock. -expression()- tells Stata what it
should take as the dependent variable. In this case you told it to
take -normal(xb(select))- as the dependent variable. -xb(select)- is
the linear predictor of the selection equation, and -normal()- is the
cumulative distribution function for the normal distribution. The
selection equation is in essence a probit model, which means that it
transforms the linear predictor using the CDF of the normal
distribution to get the predicted probability. So -normal(xb(select))-
is the predicted probability to be selected, i.e. the predicted
probability of holding stock.

Hope this helps,
Maarten

--- Miguel Ampudia wrote me privately:
My name is Miguel Ampudia and I am a PhD student in economics at
Boston University. I have a problem I am not sure how to solve with
Stata and I was wondering if you could give me your opinion. I have
registered with Statalist but the server bounces my e-mails constantly
and I have had no response from the moderator when I e-mailed him. I
would appreciate it if you could help me with this.

My problem is the following:

I am running a Heckman selection model and I want to get the effect of
my different exogenous variables on the probability of the selection
variable occurring. To be more specific, I am studying the decision to
hold stock and what percentage of my savings I invest in stock.
Whether I invest in stock or not is my selection variable. I use the
heckman command as summarized here:

heckman stock_share college ... , two sel(stock = college ... +
identification variables)

And then I want to assess the effect of having a college degree on the
probability of holding stock. I thought this could be done as follows:

predict prob1 if college == 0, psel

predict prob2 if college == 1, psel

Then I would get my result by doing result = prob2 - prob1

First, is this the right way of getting the effect on the probability
I want? If so, how do I calculate the standard error of result?

After reading one of your answers to a similar problem on Statalist, it
seems that I could get what I want by doing:

margins, dydx(*) expression(normal(xb(stock)))

but I am not sure since I do not really understand the contents
specified in this command.

Thank you very much for your attention.

Best,

Miguel

--
--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------
```
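
For readers who want to try the -margins- approach described in the reply, here is a minimal sketch. It is not the poster's model: it assumes Stata's shipped example dataset womenwk, and the `college` indicator is a hypothetical stand-in built from `education`. The selection equation is specified without its own dependent variable, so it is stored under the name `select`.

```stata
* Sketch only: womenwk is Stata's example data; college is a
* hypothetical stand-in for the poster's college-degree indicator.
webuse womenwk, clear
generate byte college = education >= 16 if !missing(education)

* Two-step Heckman model. wage is missing for non-selected cases,
* so no separate selection dependent variable is needed here.
heckman wage i.college age, select(i.college age married children) twostep

* Average discrete change in Pr(selected) when college goes from 0 to 1,
* holding the other covariates at their observed values; margins
* reports delta-method standard errors for this effect.
margins, dydx(college) expression(normal(xb(select)))

* The built-in selection-probability prediction is expected to
* reproduce the same quantity.
margins, dydx(college) predict(psel)
```

Because -margins- evaluates both college == 0 and college == 1 over the same estimation sample, the control variables are held fixed, which is what the two separate -predict ..., psel- calls on subsamples do not do. It also supplies the standard error that was asked about.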