Hi,
I am not the world expert on this but I am working on something at the
moment which has count data so I am keenly looking forward to an expert
answer to your question from the list.
But my thoughts are that, to be a proper Poisson, you should have
mean = variance. If the mean is less than the variance then you have an
overdispersion problem (I think that is what it is called!), which sort of
implies that the events are not independent (see the Stata Reference Manual
N-R, technical note, pg6, under nbreg). But to apply a Poisson model, don't the
events have to be independent?
A negative binomial model would be better, in my opinion.
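As a rough check, the variance-to-mean ratio can be worked out directly from the summary statistics quoted in the original message below. This is only a sketch in Python rather than Stata; the helper name dispersion_ratio is made up for illustration, and the four numbers are copied from the posted output. A ratio of 1 is what a Poisson distribution would give.

```python
# Means and variances copied from the summary statistics quoted below.
mean_days, var_days = 47.29855, 11012.31   # Days
mean_ln, var_ln = 2.927268, 1.687095       # lnDays


def dispersion_ratio(mean, variance):
    """Variance-to-mean ratio (hypothetical helper); 1 for a Poisson."""
    return variance / mean


ratio_days = dispersion_ratio(mean_days, var_days)
ratio_ln = dispersion_ratio(mean_ln, var_ln)

print(f"Days:   variance/mean = {ratio_days:.1f}")
print(f"lnDays: variance/mean = {ratio_ln:.2f}")
```

Two caveats on reading these ratios. They are unconditional, whereas the Poisson regression assumption concerns the variance given the covariates, so they are a hint rather than a test. And the logged values are no longer counts, so a ratio below 1 for lnDays does not by itself make them Poisson.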
rajesh
-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Hugh Colaco
Sent: 23 March 2006 15:24
To: statalist
Subject: st: Count data regression
Sorry, I am just getting used to Gmail and am not sure how "me"
appeared as my user name. Here is my question again.
Hugh
My dependent variable is the number of days, a count variable which is
censored from below at 2. The summary stats are below (see # 1). As
you can see, the number of days ranges from 2 to 2426. Since the
unconditional variance > mean, I assume that I should use a negative
binomial regression rather than a Poisson regression.
However, if I take the log of the number of days, then the
unconditional variance < mean (see # 2), so I could run a Poisson regression.
Any thoughts on the above? Is there any issue if I first take the log
and then run a Poisson or negative binomial regression? Would it defeat the
very purpose of the count regression? Anything else I need to
consider?
Thanks,
Hugh
1)
                            Days
-------------------------------------------------------------
      Percentiles      Smallest
 1%            2              2
 5%            2              2
10%            4              2       Obs                3728
25%            7              2       Sum of Wgt.        3728
50%           17                      Mean           47.29855
                        Largest       Std. Dev.      104.9396
75%           45           1268
90%          114           1607       Variance       11012.31
95%          187           1781       Skewness        8.80124
99%          448           2426       Kurtosis       131.2864
2)
                           lnDays
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .6931472       .6931472
 5%     .6931472       .6931472
10%     1.386294       .6931472       Obs                3728
25%      1.94591       .6931472       Sum of Wgt.        3728
50%     2.833213                      Mean           2.927268
                        Largest       Std. Dev.      1.298882
75%     3.806663       7.145196
90%     4.736198       7.382124       Variance       1.687095
95%     5.231109       7.484931       Skewness       .3821456
99%     6.104793       7.793999       Kurtosis       2.691119
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/