# st: RE: Count data regression-only a opinion...

 From "Rajesh Tharyan" To Subject st: RE: Count data regression-only a opinion... Date Thu, 23 Mar 2006 15:45:27 -0000

```Hi,

I am not the world expert on this but I am working on something at the
moment which has count data so I am keenly looking forward to an expert

But my thoughts are that, to be a proper Poisson, you should have
mean = variance. If the variance is greater than the mean then you have an
overdispersion problem (I think that is what it is called!), which sort of
implies that the events are not independent (see the Stata reference manual
N-R, technical note, pg 6, under nbreg). But to apply a Poisson model, don't
the events have to be independent?

A negative binomial model would be better, in my opinion.

rajesh

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Hugh Colaco
Sent: 23 March 2006 15:24
To: statalist
Subject: st: Count data regression

Sorry, I am just getting used to Gmail and am not sure how "me"
appeared as my user name. Here is my question again.

Hugh

My dependent variable is the number of days, a count variable which is
censored from below at 2. The summary stats are below (see # 1). As
you can see, the number of days ranges from 2 to 2426. Since the
unconditional variance > mean, I assume that I should use a Neg
binomial reg rather than a Poisson reg.

However, if I take the log of the number of days, then the
unconditional variance < mean (see # 2), so I could run a Poisson reg.

Any thoughts on the above? Is there any issue if I first take the log
and then run a poisson or neg binomial regression? Would it defeat the
very purpose of the count regression? Anything else I need to
consider?

Thanks,

Hugh

1)

                            Days
-------------------------------------------------------------
      Percentiles      Smallest
 1%            2              2
 5%            2              2
10%            4              2       Obs                3728
25%            7              2       Sum of Wgt.        3728

50%           17                      Mean           47.29855
                        Largest       Std. Dev.      104.9396
75%           45           1268
90%          114           1607       Variance       11012.31
95%          187           1781       Skewness        8.80124
99%          448           2426       Kurtosis       131.2864

2)
                           lnDays
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .6931472       .6931472
 5%     .6931472       .6931472
10%     1.386294       .6931472       Obs                3728
25%      1.94591       .6931472       Sum of Wgt.        3728

50%     2.833213                      Mean           2.927268
                        Largest       Std. Dev.      1.298882
75%     3.806663       7.145196
90%     4.736198       7.382124       Variance       1.687095
95%     5.231109       7.484931       Skewness       .3821456
99%     6.104793       7.793999       Kurtosis       2.691119

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
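The mean-versus-variance comparison discussed in the thread can be checked by hand from the posted summary statistics for Days. The sketch below is only a rough illustration, not anything the posters ran: the helper name `dispersion_ratio` is made up for this example, and the two numbers are copied from the Days output above. It looks only at the unconditional variance and mean, whereas what matters for the regression is the dispersion conditional on the covariates, so treat it as a first indication rather than a test.

```python
# Summary statistics copied from the posted output for Days (table 1).
mean_days = 47.29855
var_days = 11012.31


def dispersion_ratio(mean, variance):
    """Return the variance-to-mean ratio (hypothetical helper for this example).

    For a Poisson variable the ratio is about 1; values well above 1
    point to overdispersion, values below 1 to underdispersion.
    """
    if mean <= 0:
        raise ValueError("mean must be positive")
    return variance / mean


ratio_days = dispersion_ratio(mean_days, var_days)
print(f"Days: variance/mean = {ratio_days:.1f}")
```

With these figures the ratio comes out in the hundreds, far from the value of 1 that a Poisson variable would give, which is consistent with the suggestion to prefer a negative binomial model for the raw counts.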