# RE: st: Heckman Selection Rule

From: "Maarten Buis"; Subject: RE: st: Heckman Selection Rule; Date: Fri, 31 Aug 2007 15:02:51 +0200

```
--- georg wernicke wrote:
> Verbeek (2000) argues that the selection equation should at least
> contain all the variables the structural equation contains. However,
> Linder and de Groot (2006) argue that the variables of the two parts
> can be different.

Please give complete references.

--- Seema Bhatia wrote:
> Also, how does one verify that this 'identifying' variable that separates
> the two equations is valid in the sense that it determines whether that case
> is selected or not but does not determine the LHS in the second step?

--- georg wernicke wrote:
> the unique variable the selection process should contain is probably a
> dummy which is used as the selection identifier. let's say you have data
> for workers; some work, some are unemployed. then create a dummy for
> whether the worker has work or not and use this in the selection
> equation as the identifier.

The identifying variables mean something different here: these are the
variables that influence the probability of being selected but not the
outcome of the equation of interest; this assumption makes sure that the
model is identified. An identifying variable is not a variable that
marks which observations are selected and which are not. The latter
variable is unnecessary when using -heckman- (the observations with a
missing value on the dependent variable are not selected, all others are).

To answer Seema's original question: These types of models try to control
for things you have not observed. As a result you do not have all the
necessary information available in your dataset. The information you are
missing comes from assumptions/theory, in this case the assumption that
the identifying variable only influences the probability of selection.
If you could empirically verify that your identifying variable was good,
you would not need -heckman-. This leads to a catch-22 situation: you
either have to use -heckman-, but then you can't verify the identifying
variable; or you can verify the identifying variable, but then you should
not use -heckman-. So if you have to use -heckman-, an important part of
the information contained in the parameter estimates does not come from
your data, but from your theory. As a consequence I see -heckman- as
primarily a theoretical exercise with a limited amount of empirical
content, instead of an empirical estimate.

hope it helps,

Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

```
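
The distinction between an identifying variable and a selection indicator may be easier to see with the model written out. What follows is a sketch in the usual textbook form of the selection model. The notation (s, z, x, gamma, beta, rho, sigma, lambda) is an assumption of this note and does not appear in the thread.

```latex
% Selection equation: s_i = 1 means the outcome y_i is observed
s_i = \mathbf{1}\{ z_i'\gamma + u_i > 0 \}

% Outcome equation of interest, observed only when s_i = 1
y_i = x_i'\beta + \varepsilon_i

% Assumed joint normality of the two error terms
(u_i, \varepsilon_i) \sim N\left( 0,
  \begin{pmatrix} 1 & \rho\sigma \\ \rho\sigma & \sigma^2 \end{pmatrix} \right)

% Conditional mean of the outcome in the selected sample
E[y_i \mid x_i, z_i, s_i = 1] = x_i'\beta + \rho\sigma\,\lambda(z_i'\gamma),
\qquad \lambda(a) = \frac{\phi(a)}{\Phi(a)}
```

In this notation an identifying variable is an element of z that is excluded from x: it shifts the probability of selection but has no direct effect on y. If z and x contain the same variables, beta and the correction term are separated only by the nonlinearity of lambda. The exclusion itself is the assumption the post describes as coming from theory rather than from the data. The indicator s needs no variable of its own in -heckman-, since it is implied by whether y is missing. With hypothetical variable names y, x1, x2 and z1, the call would look roughly like `heckman y x1 x2, select(x1 x2 z1)`.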