st: Heckprob estimation question

From   "Sarah Edgington" <>
To   <>
Subject   st: Heckprob estimation question
Date   Wed, 10 Mar 2010 15:53:54 -0800

Hello all,
I am trying to estimate probit models with Heckman selection using the
heckprob command (using Stata/SE 10.1 for Windows).  I first noticed a
problem when I ran the same model twice and noted that Stata performed a
different number of iterations when fitting the models.  Since then I've run
the models a number of times with the trace and showstep options on and
confirmed that the iterations are slightly different each time.  In general,
the resulting coefficient and standard error estimates all seem to be
approximately the same for each run (differences are in the fourth decimal
place when there are differences).  The fact that I'm not getting precisely
the same results each time is disconcerting but I would be less worried
about it were it not for the fact that sometimes the models do not converge
at all within a reasonable time frame.  The iterations for the first three
estimation steps--fitting probit model, fitting selection model, and
fitting starting values--are always the same.  It's when it gets to fitting
the full model that the runs start to diverge.  

I do have some independent variables that I am including in the main model
but not the selection model because they are only observed for the selected
population.  However, I've read a number of examples of the probit model
with Heckman selection that don't include the same set of covariates in the
selection model, leading me to believe that this strategy is not inherently
flawed.  Moreover, I seem to have the same estimation problem even if I
leave out entirely the explanatory variables that are only observed for the
selected population (thus running a model where the only variable that
differs between the selection model and the main model is the variable we're
using to identify the selection model).

I tried setting the seed, but this doesn't seem to change the behavior.
Running the same model with the same seed set still results in different
iterations.  I'm now really worried since this suggests a) I have no way of
perfectly replicating my own results and b) there is something fundamental
about the way these models work that I clearly do not understand.

Has anyone seen this before?  Is this a sign of some underlying problem with
the data or the model?  If so, does anyone have any thoughts on how to
pinpoint what exactly is causing the problem?


-Sarah Edgington
