Good morning all,
I am estimating a panel model using -xtnbreg- and -xtpoisson- in both
their random effects (RE) and fixed effects (FE) versions. The data
show signs of over-dispersion, as the standard deviation is consistently
about 50% to 150% higher than the mean. Also, if I estimate the model
using -nbreg-, ignoring the panel structure, the output includes a
likelihood-ratio test indicating that alpha is significantly different
from zero, which again points to over-dispersion.
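For concreteness, the checks described above amount to something like
the following sketch. The variable names (y for the count outcome, x1
and x2 for the regressors) are placeholders for illustration, not my
actual variables.

```stata
* Placeholder names: y = count outcome, x1 x2 = regressors (hypothetical).

* Compare the mean with the standard deviation and variance of the outcome
summarize y, detail

* Pooled negative binomial, ignoring the panel structure; the footer of
* the output reports a likelihood-ratio test of alpha = 0
nbreg y x1 x2
```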
I would appreciate any help with the following questions (I am waiting
for Long's well-known tutorial to arrive in the mail, so I apologize if
the answers lie within):
(1) My results are moderately "better" (in the sense that the key
independent variables are more likely to be statistically significant in
a particular specification) using -xtpoisson- instead of its negative
binomial counterpart. The over-dispersion obviously makes me hesitant
to rely on these "better" results, though. What are the statistical
consequences of insisting on Poisson estimation when the sample is
over-dispersed? Does it bias coefficients? Or does it just make the
model less efficient? Should I run and hide from Poisson in the face of
over-dispersion? Or can its use be justified?
(2) From the Stata manual I understand that "fixed effects" and "random
effects" in the -xtnbreg- context refer to the modeling of the
dispersion parameter alpha, and not to FE and RE in the "normal" (e.g.
OLS) sense. Does this special nature of FE and RE in -xtnbreg- mean
that, in comparing the -xtnbreg- results to their -xtpoisson- FE and RE
counterparts, I am really comparing two totally different beasts?
(3) Finally, and most importantly, it is unclear to me how one decides
whether the -xtnbreg, fe- or the -xtnbreg, re- model is more
appropriate. In my case, results are "better" (the variables of
interest are more often statistically significant) when using the FE
option (whether using -xtnbreg- or -xtpoisson-). But using the FE
option also causes Stata to drop a number of groups from the regression
because of all-zero outcomes. Using the RE option, these groups remain
in the analysis. How should I go about justifying the use of one (RE) or
the other (FE)?
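To make question (3) concrete, here is a rough sketch of the four
specifications I am comparing, followed by a count of the groups whose
outcome is zero in every period (the ones the FE estimators drop). The
names y, x1, x2, id and year are placeholders, and the sketch assumes a
Stata release that has -xtset-. The count runs over all observations in
memory, so it may differ slightly from what the estimation output
reports if some observations are excluded for missing values.

```stata
* Placeholder names: y = count outcome, x1 x2 = regressors,
* id = panel identifier, year = time variable (all hypothetical).
xtset id year

* Poisson, fixed and random effects
xtpoisson y x1 x2, fe
xtpoisson y x1 x2, re

* Negative binomial, fixed and random effects
xtnbreg y x1 x2, fe
xtnbreg y x1 x2, re

* Count the groups whose outcome is zero in every period
egen double total_y = total(y), by(id)
egen byte one_per_id = tag(id)
count if one_per_id & total_y == 0
```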
Thank you, and I will go enjoy my morning coffee now.
Jason Webb Yackee, PhD Candidate; J.D.
Fellow, Gould School of Law
University of Southern California
jyackee@law.usc.edu
Cell: 919-358-3040