RE: st: Heckman Selection Rule

From   "Maarten Buis" <>
To   <>
Subject   RE: st: Heckman Selection Rule
Date   Fri, 31 Aug 2007 15:02:51 +0200

--- georg wernicke wrote:
> Verbeek (2000) argues that the selection equation should at least
> contain all the variables the structural equation contains. However,
> Linder and de Groot (2006) argue that the variables of the two parts
> can be different.

This answer would be a lot more informative if you included the 
complete references.

--- Seema Bhatia wrote:
> Also, how does one verify that this 'identifying' variable that separates
> the two equations is valid in the sense that it determines whether that case
> is selected or not but does not determine the LHS in the second step?

--- georg wernicke wrote:
> the unique variable the selection process should contain is probably a
> dummy which is used as the selection identifier. let's say you have data
> for workers, some work, some are unemployed. then create a dummy for
> whether the worker has work or not and use this in the selection equation
> as the identifier.

The identifying variables mean something different here: these are the 
variables that influence the probability of being selected but not the 
outcome in the equation of interest; this assumption makes sure that the 
model is identified. An identifying variable is not a variable that 
indicates which observations are selected and which are not. The latter 
variable is unnecessary when using -heckman- (the observations with a 
missing value on the dependent variable are not selected, all others are).
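
A minimal sketch of what this looks like in practice, assuming the womenwk 
example dataset that ships with the Stata documentation. The choice of 
married and children as identifying variables is only an illustration 
borrowed from that example; whether they really leave the wage unaffected 
is an assumption, not something the command checks.

```stata
* Sketch only: womenwk is the example dataset used in the Stata manuals.
* wage is missing for women who do not work, so -heckman- treats those
* observations as not selected; no separate selection dummy is needed.
webuse womenwk, clear

* How many observations are not selected (missing dependent variable)?
count if missing(wage)

* Outcome equation: wage on educ and age.
* Selection equation: educ and age plus married and children, which are
* ASSUMED to influence whether a woman works but not her wage.
heckman wage educ age, select(married children educ age)
```

Here married and children play the role of identifying variables: they 
appear in select() but not in the wage equation.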

To answer Seema's original question: These types of models try to control 
for things you have not observed. As a result you do not have all the 
necessary information available in your dataset. The information you are 
missing comes from assumptions/theory, in this case the assumption that 
the identifying variable only influences the probability of being 
selected. If you could empirically verify that your identifying variable 
was good, you would not need -heckman-. This leads to a catch-22 
situation: you either have to use -heckman-, but then you can't verify the 
identifying variable; or you can verify the identifying variable, but then 
you should not use -heckman-. So if you have to use -heckman-, an 
important part of the information contained in the parameter estimates 
does not come from your data, but from your theory. As a consequence I see 
-heckman- as primarily a theoretical exercise with a limited amount of 
empirical content, instead of an empirical estimate. 
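
For readers who prefer to see where the assumption enters, the standard 
textbook form of the selection model can be written out as follows. The 
notation (x, z, beta, gamma, rho, sigma, lambda) is introduced here for 
illustration and does not come from the thread.

```latex
% Outcome equation, observed only when s_i = 1
y_i = x_i \beta + u_i
% Selection equation
s_i = 1\left[\, z_i \gamma + v_i > 0 \,\right]
% (u_i, v_i) bivariate normal, sd(u_i) = \sigma, var(v_i) = 1, corr = \rho
% Expected outcome in the selected sample
E\left[\, y_i \mid x_i, z_i, s_i = 1 \,\right]
    = x_i \beta + \rho \sigma \, \lambda(z_i \gamma),
\qquad
\lambda(a) = \frac{\phi(a)}{\Phi(a)}
```

An identifying variable is an element of z that is left out of x. Its 
coefficient in the outcome equation is fixed at zero by assumption rather 
than estimated. Without such a variable, the correction term is separated 
from x beta only through the nonlinearity of lambda.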

hope it helps,


Maarten L. Buis
Department of Social Research Methodology 
Vrije Universiteit Amsterdam 
Boelelaan 1081 
1081 HV Amsterdam 
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434 

+31 20 5986715
