
Re: st: Using 2 stage Heckmen Sample Selection with Lags in STATA

Subject   Re: st: Using 2 stage Heckmen Sample Selection with Lags in STATA
Date   01 Aug 2007 19:42:17 +0100

Hi Austin

I wish it were as easy as having no missing data at all. I am dealing with
Sub-Saharan African data, so I am lucky to have whatever data I do for the
first few years of my study (1985-2005). I am looking at bilateral trade
between a country pair (Xij), so the exports of j are the import values
for i. This is my left-hand-side variable. I have exhausted all options in
terms of data sources for import values and yet still have about a quarter
of my data missing. There are also a significant number of zeroes in the
dataset. However, since the main dataset indicated quite clearly whether
the data were not available or were zero, a zero import value on the LHS
is treated as a genuine observation. That reduced my problem a little but
still leaves the missing values issue.

The Hausman test has indicated that I should be running a random-effects
model on my dataset, but I have to account for the non-random pattern of
zeroes first.
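
For reference, a Hausman comparison of fixed and random effects is usually
run in Stata along the lines below. This is only a sketch: pairid, year,
lntrade, lngdp_i, lngdp_j, lnpop_i and lnpop_j are hypothetical variable
names, not names from the dataset discussed in this thread.

```stata
* Sketch only: all variable names are hypothetical placeholders.
* Declare the panel: one unit per country pair, observed yearly.
xtset pairid year

* Fixed-effects fit; time-invariant regressors such as distance
* would be dropped here, so they are left out of the comparison.
xtreg lntrade lngdp_i lngdp_j lnpop_i lnpop_j, fe
estimates store fixed

* Random-effects fit on the same regressors.
xtreg lntrade lngdp_i lngdp_j lnpop_i lnpop_j, re
estimates store random

* Hausman test: the consistent estimator (fe) is listed first.
hausman fixed random
```

Note that if the dependent variable is in logs, observations with zero
trade are missing in lntrade and drop out of both fits, which is one
reason the zeroes need separate treatment.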

I hope that clarifies my position - please fire away if I am still being



>I think I already answered the question about the -heckman- approach
>likely being hard in your panel data, or perhaps impossible without a
>significant investment in learning -gllamm- on your part, or perhaps
>writing new routines.  But I still don't see necessary detail in your
>description of the data--how are you measuring trade?  Volume of
>exports+imports?  Net exports?  Indicator for any trade at all? Why is
>there missing data as opposed to just zeros for your LHS var?  If
>you've got trade measured as a strictly positive variable and missings
>where no trade exists, then you can replace trade=0 if mi(trade) and
>run xtpoisson with country pair fixed effects, right?
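
A minimal sketch of the suggestion quoted above, assuming the missing
values really do mean "no trade" (which is not the case for every dataset).
The names pairid, year, trade, trade0, lngdp_i, lngdp_j, lnpop_i and
lnpop_j are hypothetical placeholders.

```stata
* Sketch only: all variable names are hypothetical placeholders.
* Appropriate only if a missing value means no trade took place.
xtset pairid year

* Work on a copy so the original missing-value pattern is preserved.
generate double trade0 = trade
replace trade0 = 0 if missing(trade0)

* Poisson regression with country-pair fixed effects; the dependent
* variable stays in levels, so zeroes remain in the estimation sample.
xtpoisson trade0 lngdp_i lngdp_j lnpop_i lnpop_j, fe
```

Recent Stata versions also accept vce(robust) with xtpoisson, fe, which
may be worth adding when the Poisson variance assumption is doubtful.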
>On 8/1/07, Seema Bhatia <> wrote:
>> Thanks for this response - but I am not sure how to carry on. Having done
>> the Hausman Test, I have to fit a random effects model to study bilateral
>> trade as a function of GDPs, populations, distance, landlockedness,
>> contiguity, cultural similarities and membership of trade blocs amongst
>> other things.
>> My panel contains cross sectional data for 26 country pairs over 21
>> years. I will be breaking this panel down into shorter time frames
>> anyway (5 year periods). However, I have a huge problem related to
>> missing data on the LHS (bilateral trade) and therefore need to model
>> it. I was hoping that the answer lies in using the Heckman Sample
>> Selection model - the non-random pattern of the missing values could be
>> modelled using the Heckman two-stage analysis.
>> Has anyone attempted this? I have seen this sample selection
>> method being used in recent economic literature quite a lot on panels.
>> There are also possible endogeneity issues within the model, which is why
>> lags (first differences particularly) were something I am interested in
>> looking at. However, it is a much smaller problem compared to my
>> problem of missing data and therefore I need to correct for the latter
>> first.
>> I would be grateful for further guidance on the issue and would
>> appreciate any code out there to help me do this.
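
As a rough illustration of the lag and first-difference specifications
mentioned above, Stata's time-series operators can be used once the panel
is declared. This is a sketch under assumed names (pairid, year, lntrade,
lngdp_i, lngdp_j are hypothetical), and it does not address the missing
data problem.

```stata
* Sketch only: all variable names are hypothetical placeholders.
xtset pairid year

* Random-effects model with one-year lags of the regressors (L. operator).
xtreg lntrade L.lngdp_i L.lngdp_j, re

* First-difference specification (D. operator) estimated by pooled OLS.
regress D.lntrade D.lngdp_i D.lngdp_j
```

With gaps in the panel, L. and D. return missing wherever the previous
year is absent, so these specifications shrink the sample further.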
>>> Seema--
>>> As Stas indicated in a prior post, you
>>> cannot run -heckman- on panel data, though the -gllamm- and -ssm-
>>> (both available via -findit-) commands may provide alternatives. There
>>> are other alternatives out there, but you may not get much useful
>>> guidance without providing more info about your specific model.
