Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: Interpretation of interaction term in log linear (non linear) model

From	David Hoaglin <[email protected]>
To	[email protected]
Subject	Re: st: Interpretation of interaction term in log linear (non linear) model
Date	Sun, 16 Jun 2013 21:13:00 -0400

Dear Suryadipta,

Thanks for the further explanation.

If your dependent variable (Trade) is zero in about 15% of the
observations, I am skeptical that it would be adequate to use a
fixed-effects Poisson model without explicitly accounting for the
source of the zeros.  If a "zero" is due to non-reporting, shouldn't
it be a missing value?  With such a continuous dependent variable, it
would be unlikely to observe zero by chance.  If a pair of countries
is not able to trade, that would need to be accounted for as a
"structural zero," either in a separate part of the model or by
omitting those observations from the analysis.  That huge literature
may argue in favor of Poisson, but what is the empirical evidence on
how well the models fit?
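
One way to start looking at the source of the zeros is to tabulate them by country pair: pairs that report zero in every period are candidates for "structural zeros," while pairs with only occasional zeros deserve a closer look for non-reporting. The following is a minimal Python sketch of that tabulation. The record layout, the country codes, and the trade values are all made up for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (exporter, importer, year, trade). All values are invented.
records = [
    ("A", "B", 2001, 12.5),
    ("A", "B", 2002, 0.0),
    ("A", "C", 2001, 0.0),
    ("A", "C", 2002, 0.0),
    ("B", "C", 2001, 3.1),
    ("B", "C", 2002, 4.0),
]


def zero_summary(rows):
    """Return {pair: (n_obs, n_zero)} for each (exporter, importer) pair."""
    counts = defaultdict(lambda: [0, 0])
    for exporter, importer, _year, trade in rows:
        pair = (exporter, importer)
        counts[pair][0] += 1          # observations for this pair
        if trade == 0:
            counts[pair][1] += 1      # zero-trade observations for this pair
    return {pair: tuple(c) for pair, c in counts.items()}


def always_zero_pairs(rows):
    """Pairs whose trade is zero in every observed period."""
    summary = zero_summary(rows)
    return sorted(pair for pair, (n_obs, n_zero) in summary.items() if n_obs == n_zero)


summary = zero_summary(records)
never_trade = always_zero_pairs(records)
overall_zero_share = sum(n_zero for _, n_zero in summary.values()) / len(records)
```

A tabulation like this does not say why a zero occurred; it only separates pairs that never trade in the data from pairs that sometimes do, which is the distinction the modeling decision turns on.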

I agree that it would be problematic to have 5000 explicit fixed
effects.  What is the source of such a large number of fixed effects?

As a model-building strategy it might be instructive to set aside the
idea of fixed effects and see what happens when you use random effects
instead.

Another useful strategy when you have a large amount of data is to
split the data into parts (usually at random, with appropriate
stratification if needed), perhaps two halves.  Set one of the parts
aside (out of sight) for use later in validating the final model.  Do
the model-building on the other part.  In one project that I worked on
several years ago, we used 50% of the data for no-holds-barred model
building, another 25% for fine-tuning the "final" model, and the other
25% to get a clean estimate of prediction error.
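
A minimal Python sketch of such a 50/25/25 split, done at random within strata, might look like the following. The function name, the part labels, the seed, and the toy strata are hypothetical. Treating the units as something like country pairs rather than individual observations (so that a pair's whole series lands in one part) is an assumption, not something described above.

```python
import random
from collections import defaultdict


def split_three_ways(ids, strata, seed=12345, fractions=(0.50, 0.25)):
    """Assign each id to 'build', 'tune', or 'validate' at random within strata.

    fractions gives the shares for 'build' and 'tune'; the rest go to 'validate'.
    """
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    by_stratum = defaultdict(list)
    for unit, stratum in zip(ids, strata):
        by_stratum[stratum].append(unit)

    assignment = {}
    for stratum in sorted(by_stratum):   # fixed order keeps the result reproducible
        members = by_stratum[stratum]
        rng.shuffle(members)             # random order within the stratum
        n_build = round(fractions[0] * len(members))
        n_tune = round(fractions[1] * len(members))
        for i, unit in enumerate(members):
            if i < n_build:
                assignment[unit] = "build"
            elif i < n_build + n_tune:
                assignment[unit] = "tune"
            else:
                assignment[unit] = "validate"
    return assignment


# Toy example: 40 units in two strata of 20 each.
ids = list(range(40))
strata = ["north" if i < 20 else "south" for i in ids]
parts = split_three_ways(ids, strata)
```

Recording the seed matters here: the validation part stays "out of sight" only if the same assignment can be regenerated later without re-drawing it.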

If the data come from time series, what does the analysis do about
serial correlation?

Regards,

David Hoaglin

On Fri, Jun 14, 2013 at 9:59 AM, Suryadipta Roy <[email protected]> wrote:
> Dear David,
> Thank you for the suggestions! I have cross country time series data
> (unbalanced panel) where the dependent variable is zero for about 15%
> of the observations. Many papers have recorded more zeros, e.g. the
> paper by Santos Silva and Tenreyro that I mentioned in the previous email
> reports about 50% zero observations for the dependent variable
> (Bilateral Import/Export). I started with a fixed effects log-linear
> model (more traditional in the trade literature) and moved on to fixed
> effects Poisson (following Bill Gould's Stata Blog suggestions and the
> Stata meeting presentation by Austin Nichols :
> http://www.stata.com/meeting/boston10/boston10_nichols.pdf , as well
> as some other papers in the literature). I have indeed tried Negative
> Binomial and might report the results in the paper, but Stata does not
> have a true fixed effects NB model, since the coefficients of the time
> invariant explanatory variables are reported (Paul Allison, "Fixed
> effects regression models", Sage, 2009; some other issues are
> discussed here: Guimarães, P. (2008), The fixed effects negative
> binomial model revisited, Economics Letters, 99, pp. 63–66), and the
> bootstrap standard errors in NB are taking forever to run with my data.
> Based on the theoretical development in the literature, I must control
> for fixed effects in my regressions. I have also tried -zip- and
> -zinb- but there is no conditional fixed effects model in Stata. I did
> not venture to introduce about 5000 fixed effects in my regressions
> with -zip- / -zinb-; most likely these would take forever to run
> (with more than 100,000 observations), would not converge, and would
> suffer from the incidental parameters problem. I have also looked into
> hurdle models, but the question is whether the zeros are due to
> non-reporting of data or whether countries are not able to trade for
> some other reason. There is a huge literature in this area that has
> argued in favor of Poisson.
> Thank you very much for all the comments and helpful suggestions!
>
> Best regards,
> Suryadipta.
