
RE: st: Sample selection and endogeneity (or, combining heckman and ivreg)

From   kokootchke <>
To   statalist <>
Subject   RE: st: Sample selection and endogeneity (or, combining heckman and ivreg)
Date   Thu, 6 Aug 2009 02:30:01 -0400

Austin Nichols wrote:
> Shehzad's code has some typos (compare the if qualifiers), I think,
> and without a ref or a proof of consistency, I can't see how anyone
> would get those kinds of results published. To be able to trust them
> yourself, you would also want to run some simulations to assess finite
> sample performance for samples that look like yours (with true coefs
> picked to be near your estimated coefs). 

Austin, thank you very much for your response. I agree that not having
a reference would weaken my results, and this is why I'm trying to see
if someone in this Stata group can point me in the right direction. I
have thought about the simulations as well and am contemplating doing
them, but I've never done this before and would like some pointers as
to where I should start. Would you have any suggestions, or a reference
that could help in that regard? Also, what do you mean by "for samples
that look like yours"?

> Also, note that the original
> poster specifies panel data, though what the DGP is, I do not know.
> Note in particular that the dependent var is a weighted average of
> spreads for one or multiple bonds in a given quarter but a weighted
> average of a nonnegative (skewed?) variable is not guaranteed to have
> any desirable properties. Also, still assuming that the dep var is
> nonnegative or strictly positive, OLS and heckman are inappropriate,
> relative to a GLM type model. Presumably, the new GMM models in Stata
> 11 are a good place to turn, assuming suitable moments can be
> specified.

This is a very good point. I have never used GLM or GMM in this context, so could you please be more specific about what I need to know, or where I should look, in order to consider this option and try to implement it?

Thank you very much once again!

> On Wed, Aug 5, 2009 at 4:52 AM, Shehzad Ali wrote:
>> To add to John's response, if your endogenous variable is binary, then I would use the following:
>>        probit y1 x1 x2 x3
>>        predict xbeta1, xb
>>        gen imills1=normd(xb)/normprob(xb) if y1==1
>>        replace imills1=-normd(xb)/(1-normprob(xb)) if y2==0
>>        heckman y2 y1 $yvar $zvar imills1 [pw=weight], sel(selection_probit= y1 $xvar imills1) cluster(commune) mills(imr2)
>> I have assumed that the endogenous var is endogenous in both selection and outcome equation.
>> Regards,
>> Shehzad
>> ----- Original Message ----
>>> From: John Antonakis 
>>> To:
>>> Sent: Wednesday, August 5, 2009 7:18:14 AM
>>> Subject: Re: st: Sample selection and endogeneity   (or, combining heckman and ivreg)
>>> Hi:
>>> One possibility is to manually obtain predicted values of the endogenous
>>> variables (using regress), which will give you consistent estimates.
>>> Then use the predicted values in the Heckman model and bootstrap the
>>> standard errors.
>>> HTH,
>>> John.
>>> On 05.08.2009 04:51, kokootchke wrote:
>>>> Dear all,
>>>> I am trying to estimate an equation in which the dependent variable is
>>>> only observed when a selection rule applies (your typical sample
>>>> selection problem a la Heckman). One of the independent variables in
>>>> the main equation is endogenous, and I'd like to use instrumental
>>>> variables to address that issue within the Heckman framework.
>>>> I haven't been able to find any papers or references that deal with
>>>> this issue, especially because I have a panel dataset containing 40+
>>>> countries and about 60 time periods (quarters). My approach is to run
>>>> the selection probit, then use the predicted values in a 2SLS
>>>> framework. I guess I'd have to do some standard-error correction (any
>>>> hints on this would also be useful)... but I wanted to ask if you guys
>>>> could tell me whether there is a Stata command that does this or if
>>>> there are any references you could suggest?
>>>> For more information on my particular case, please see below.
>>>> Thanks!
>>>> Adrian
>>>> p.s. A few more details on my model:
>>>> I want to estimate the effects of GDP growth and other macroeconomic
>>>> variables on bond spreads, so my dependent variable in the main
>>>> equation is the yield spread of a bond. The problem is that these
>>>> spreads are primary market spreads or "spreads at launch", which means
>>>> they are only observed at the moment a country places a bond in the
>>>> market.
>>>> My panel data are organized at a quarterly frequency. Whether a country
>>>> issues one or multiple bonds in a given quarter is irrelevant as I
>>>> basically take a weighted average of all spreads issued in a given
>>>> quarter and use that as my dependent variable.
>>>> However, there are quarters when a country may not issue a bond... and
>>>> this is the selection problem I'm trying to get at using a Heckman
>>>> model.
>>>> On top of this, if we believe that the spreads are somehow related to
>>>> the level of interest rates in the country, then macroeconomic
>>>> variables such as GDP growth are going to be endogenous. I have one
>>>> (potentially two) instrumental variable I want to use, and this is why
>>>> I want to do the 2SLS...
>>>> Do you guys have any other suggestions besides what I suggested above?