



Re: st: Interpretation of interaction term in log linear (non linear) model


From   David Hoaglin <dchoaglin@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Interpretation of interaction term in log linear (non linear) model
Date   Sun, 16 Jun 2013 21:13:00 -0400

Dear Suryadipta,

Thanks for the further explanation.

If your dependent variable (Trade) is zero in about 15% of the
observations, I am skeptical that it would be adequate to use a
fixed-effects Poisson model without explicitly accounting for the
source of the zeros.  If a "zero" is due to non-reporting, shouldn't
it be a missing value?  With such a continuous dependent variable, it
would be unlikely to observe zero by chance.  If a pair of countries
is not able to trade, that would need to be accounted for as a
"structural zero," either in a separate part of the model or by
omitting those observations from the analysis.  That huge literature
may argue in favor of Poisson, but what is the empirical evidence on
how well the models fit?
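
A first step toward separating the two kinds of zeros can be sketched as follows. This is only a heuristic screen, not a test: it assumes the data can be read as (pair, trade) records with non-reported values coded as missing (`None`), and it flags country pairs whose trade is zero in every observed year as candidates for structural zeros. The function name `zero_summary` and the record layout are hypothetical.

```python
from collections import defaultdict


def zero_summary(records):
    """Summarize zeros in (pair_id, trade) records.

    Returns the overall share of zeros among non-missing observations and
    the sorted list of pairs whose trade is zero in every non-missing
    observation (candidate structural zeros, an assumption of this sketch).
    """
    n_obs = defaultdict(int)
    n_zero = defaultdict(int)
    for pair, trade in records:
        if trade is None:  # non-reported value: treat as missing, not zero
            continue
        n_obs[pair] += 1
        if trade == 0:
            n_zero[pair] += 1

    always_zero = sorted(p for p in n_obs if n_zero[p] == n_obs[p])
    total = sum(n_obs.values())
    share_zero = sum(n_zero.values()) / total if total else float("nan")
    return share_zero, always_zero


# Toy data, invented purely for illustration.
records = [
    ("A-B", 0.0), ("A-B", 0.0),    # never trades: candidate structural zero
    ("A-C", 12.5), ("A-C", 0.0),   # trades in some years
    ("B-C", 3.1), ("B-C", None),   # one value not reported
]
share_zero, always_zero = zero_summary(records)
print(share_zero, always_zero)
```

Pairs on the flagged list could then be modeled separately or omitted, while the remaining zeros stay in the main model.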

I agree that it would be problematic to have 5000 explicit fixed
effects.  What is the source of such a large number of fixed effects?

As a model-building strategy it might be instructive to set aside the
idea of fixed effects and see what happens when you use random effects
instead.

Another useful strategy when you have a large amount of data is to
split the data into parts (usually at random, with appropriate
stratification if needed), perhaps two halves.  Set one of the parts
aside (out of sight) for use later in validating the final model.  Do
the model-building on the other part.  In one project that I worked on
several years ago, we used 50% of the data for no-holds-barred model
building, another 25% for fine-tuning the "final" model, and the other
25% to get a clean estimate of prediction error.
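
The 50/25/25 split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in that project: the function name `three_way_split`, the part labels, and the fixed seed are all hypothetical, and the example splits by unit (for panel data, a country pair) so that each unit's whole time series lands in one part.

```python
import random
from collections import Counter, defaultdict


def three_way_split(ids, strata=None, fractions=(0.50, 0.25, 0.25), seed=12345):
    """Randomly assign each id to 'build', 'tune', or 'validate'.

    ids       : list of unit identifiers
    strata    : optional list, same length as ids, giving each unit's stratum;
                the split is then done separately within each stratum
    fractions : target shares for the three parts (the last part takes
                whatever remains after rounding)
    seed      : fixed seed so the split can be reproduced
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for i, unit in enumerate(ids):
        key = strata[i] if strata is not None else None
        groups[key].append(unit)

    assignment = {}
    for key in sorted(groups, key=repr):  # fixed order keeps results reproducible
        members = groups[key]
        rng.shuffle(members)
        n = len(members)
        cut1 = round(fractions[0] * n)
        cut2 = cut1 + round(fractions[1] * n)
        for pos, unit in enumerate(members):
            if pos < cut1:
                assignment[unit] = "build"
            elif pos < cut2:
                assignment[unit] = "tune"
            else:
                assignment[unit] = "validate"
    return assignment


# Toy example with 1000 hypothetical country pairs.
pairs = [f"pair{k}" for k in range(1000)]
split = three_way_split(pairs)
print(Counter(split.values()))
```

Fixing the seed matters here: the validation part stays "out of sight" only if the same units are set aside every time the split is rerun.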

If the data come from time series, what does the analysis do about
serial correlation?

Regards,

David Hoaglin

On Fri, Jun 14, 2013 at 9:59 AM, Suryadipta Roy <sroy2138@gmail.com> wrote:
> Dear David,
> Thank you for the suggestions! I have cross country time series data
> (unbalanced panel) where the dependent variable is zero for about 15%
> of the observations. Many papers have recorded more zeros, e.g. the
> paper by Silva and Tenreyro that I mentioned in the previous email
> reports about 50% of zero observations for the dependent variable
> (Bilateral Import/Export). I started with a fixed effects log-linear
> model (more traditional in the trade literature) and moved on to fixed
> effects Poisson (following Bill Gould's Stata Blog suggestions and the
> Stata meeting presentation by Austin Nichols :
> http://www.stata.com/meeting/boston10/boston10_nichols.pdf , as well
> as some other papers in the literature). I have indeed tried Negative
> Binomial and might report the results in the paper, but Stata does not
> have a true fixed effects NB model, since the coefficients of the time
> invariant explanatory variables are reported (Paul Allison, "Fixed
> effects regression models", Sage, 2009; some other issues are
> discussed in Guimarães, P. (2008), "The fixed effects negative
> binomial model revisited", Economics Letters, 99, pp. 63–66), and the
> bootstrap standard errors in NB are taking forever to run with my data.
> Based on the theoretical development in the literature, I must control
> for fixed effects in my regressions. I have also tried -zip- and
> -zinb- but there is no conditional fixed effects model in Stata. I did
> not venture to introduce about 5000 fixed effects in my regressions
> with -zip- / -zinb- ; most likely these would take forever to run
> (with more than 100,000 observations), would not converge, and would
> suffer from the incidental parameters problem. I have also looked into
> hurdle models, but the question is whether the zeros are due to
> non-reporting of data or to countries not being able to trade for some
> other reason. There is a huge literature in this area that has argued
> in favor of Poisson.
> Thank you very much for all the comments and helpful suggestions!
>
> Best regards,
> Suryadipta.
