# Re: st: Heckman Selection Rule

 From "georg wernicke" To statalist@hsphsun2.harvard.edu Subject Re: st: Heckman Selection Rule Date Fri, 31 Aug 2007 17:09:50 +0100

```
The complete references are:

Verbeek, M. (2000). A Guide to Modern Econometrics.
John Wiley & Sons.

and

Linders, G. J. M. and H. L. F. de Groot (2006). Estimation of the
Gravity Equation in the Presence of Zero Flows.

when using -heckman- in Stata, it will ask you for a selection
dependent variable. use the dummy for that.

georg

On 8/31/07, Maarten Buis <M.Buis@fsw.vu.nl> wrote:
> --- georg wernicke wrote:
> > Verbeek (2000) argues that the selection equation should at least
> > contain all the variables the structural equation contains. however,
> > Linders and de Groot (2006) argue that the variables of the two parts
> > can be different.
>
> complete references.
>
> --- Seema Bhatia wrote:
> > Also, how does one verify that this 'identifying' variable that separates
> > the two equations is valid in the sense that it determines whether that case
> > is selected or not but does not determine the LHS in the second step?
>
> --- georg wernicke wrote:
> > the unique variable the selection process should contain is probably a
> > dummy which is used as the selection identifier. let's say you have data
> > for workers: some work, some are unemployed. then create a dummy for
> > whether the worker has work or not and use this in the selection
> > equation as the identifier.
>
> The identifying variables mean something different here: these are the
> variables that influence the probability of being selected but not the
> outcome of the equation of interest; this assumption makes sure that the
> model is identified. It is not a variable that identifies which
> observation is selected and which is not. The latter variable is
> unnecessary when using -heckman- (the observations with a missing value
> on the dependent variable are not selected, all others are.)
>
> To answer Seema's original question: These types of models try to control
> for things you have not observed. As a result you do not have all the
> necessary information available in your dataset. The information you are
> missing comes from assumptions/theory, in this case the assumption that
> the identifying variable influences only the probability of being selected,
> not the outcome. If you could empirically verify that your identifying
> variable was good, you would not need -heckman-. This leads to a catch-22
> situation: you either have to use -heckman-, but then you can't verify the
> identifying variable; or you can verify the identifying variable, but then
> you should not use -heckman-. So if you have to use -heckman-, an important
> part of the information contained in the parameter estimates does not come
> from your data, but from your theory. As a consequence I see -heckman- as
> primarily a theoretical
> exercise with a limited amount of empirical content, instead of an
> empirical estimate.
>
> hope it helps,
>
> Maarten
>
> -----------------------------------------
> Maarten L. Buis
> Department of Social Research Methodology
> Vrije Universiteit Amsterdam
> Boelelaan 1081
> 1081 HV Amsterdam
> The Netherlands
>
> Buitenveldertselaan 3 (Metropolitan), room Z434
>
> +31 20 5986715
>
> http://home.fsw.vu.nl/m.buis/
> -----------------------------------------
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
```
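The two points made in the thread (no separate selection dummy is needed, and the identifying variables appear only in the selection equation) can be sketched in Stata as follows. This is a minimal illustration, not anything posted by the participants: it assumes Stata's shipped example dataset `womenwk`, in which `wage` is missing for women who do not work, and it assumes for the sake of the example that `married` and `children` affect whether a wage is observed but not the wage itself. The variable `works` is a hypothetical name introduced here.

```stata
* Sketch only: uses Stata's example dataset, not data from this thread.
webuse womenwk, clear

* wage is missing for non-working women; -heckman- treats observations
* with a missing dependent variable as not selected.
count if missing(wage)

* educ and age enter both equations; married and children enter only
* select(). That exclusion is an assumption, not something the data verify.
heckman wage educ age, select(married children educ age)

* The same model with an explicit selection indicator (hypothetical name).
generate byte works = !missing(wage)
heckman wage educ age, select(works = married children educ age)
```

Both calls should fit the same model, since `works` simply restates which observations have a non-missing `wage`. Whether `married` and `children` are valid identifying variables cannot be checked from this output; as argued above, that part rests on theory.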