

From: David Hoaglin <dchoaglin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Interpretation of interaction term in log linear (non linear) model
Date: Sun, 16 Jun 2013 21:13:00 -0400

Dear Suryadipta,

Thanks for the further explanation.

If your dependent variable (Trade) is zero in about 15% of the observations, I am skeptical that it would be adequate to use a fixed-effects Poisson model without explicitly accounting for the source of the zeros. If a "zero" is due to non-reporting, shouldn't it be a missing value? With such a continuous dependent variable, it would be unlikely to observe zero by chance. If a pair of countries is not able to trade, that would need to be accounted for as a "structural zero," either in a separate part of the model or by omitting those observations from the analysis.

That huge literature may argue in favor of Poisson, but what is the empirical evidence on how well the models fit?

I agree that it would be problematic to have 5000 explicit fixed effects. What is the source of such a large number of fixed effects? As a model-building strategy it might be instructive to set aside the idea of fixed effects and see what happens when you use random effects instead.

Another useful strategy when you have a large amount of data is to split the data into parts (usually at random, with appropriate stratification if needed), perhaps two halves. Set one of the parts aside (out of sight) for use later in validating the final model. Do the model-building on the other part. In one project that I worked on several years ago, we used 50% of the data for no-holds-barred model building, another 25% for fine-tuning the "final" model, and the other 25% to get a clean estimate of prediction error.

If the data come from time series, what does the analysis do about serial correlation?

Regards,

David Hoaglin

On Fri, Jun 14, 2013 at 9:59 AM, Suryadipta Roy <sroy2138@gmail.com> wrote:
> Dear David,
>
> Thank you for the suggestions! I have cross country time series data
> (unbalanced panel) where the dependent variable is zero for about 15%
> of the observations. Many papers have recorded more zeros, e.g. the
> paper by Silva and Tenreyro that I mentioned in the previous email
> reports about 50% of zero observations for the dependent variable
> (Bilateral Import/Export).
>
> I started with a fixed effects log-linear model (more traditional in
> the trade literature) and moved on to fixed effects Poisson (following
> Bill Gould's Stata Blog suggestions and the Stata meeting presentation
> by Austin Nichols:
> http://www.stata.com/meeting/boston10/boston10_nichols.pdf , as well
> as some other papers in the literature).
>
> I have indeed tried Negative Binomial and might report the results in
> the paper, but Stata does not have a true fixed effects NB model, since
> the coefficients of the time-invariant explanatory variables are
> reported (Paul Allison, "Fixed effects regression models", Sage, 2009;
> some other issues are discussed in Guimarães, P. (2008), The fixed
> effects negative binomial model revisited, Economics Letters, 99,
> pp. 63–66), and the bootstrap standard errors in NB are taking forever
> to run with my data.
>
> Based on the theoretical development in the literature, I must control
> for fixed effects in my regressions. I have also tried -zip- and
> -zinb-, but there is no conditional fixed effects model for them in
> Stata. I did not venture to introduce about 5000 fixed effects in my
> regressions with -zip- / -zinb-; most likely these would take forever
> to run (with more than 100,000 observations), would not converge, and
> would suffer from the incidental parameters problem.
>
> I have also looked into hurdle models, but the question is whether the
> zeros are due to non-reporting of data or whether countries are not
> able to trade for some other reasons; there is a huge literature in
> this area that has argued in favor of Poisson.
>
> Thank you very much for all the comments and helpful suggestions!
>
> Best regards,
> Suryadipta.
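Editor's sketch: the 50% / 25% / 25% split described in the reply above could look roughly like the following. This is only an illustration in Python, not code from the thread. The function name `split_50_25_25`, the seed, and the idea of splitting on panel-unit identifiers (so that each unit's time series stays in one part) are assumptions made here; the post itself says only that the split is usually random, with stratification if needed.

```python
import random


def split_50_25_25(ids, seed=12345):
    """Randomly partition ids into build (50%), tune (25%), and holdout (rest).

    Hypothetical helper: the names and the seed are illustrative only.
    """
    shuffled = list(ids)
    # A seeded generator makes the partition reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_build = n // 2   # model building
    n_tune = n // 4    # fine-tuning the "final" model
    build = shuffled[:n_build]
    tune = shuffled[n_build:n_build + n_tune]
    holdout = shuffled[n_build + n_tune:]  # clean estimate of prediction error
    return build, tune, holdout


# Hypothetical example: 1000 panel-unit identifiers.
build, tune, holdout = split_50_25_25(range(1000))
print(len(build), len(tune), len(holdout))  # 500 250 250
```

The holdout part would stay untouched until the final model is fixed, so that its prediction error is not contaminated by the model-building steps.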

