Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Lachenbruch, Peter" <Peter.Lachenbruch@oregonstate.edu> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | RE: st: which -cmp- option to use for poisson model with count data? |
Date | Mon, 7 May 2012 07:44:55 -0700 |
Doesn't the CLT refer to means, not counts?

________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] On Behalf Of David Roodman (droodman@cgdev.org) [DRoodman@cgdev.org]
Sent: Sunday, May 06, 2012 7:24 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: which -cmp- option to use for poisson model with count data?

It is my understanding that the Poisson model is for counts of rare events, such as emergency room admissions--something we certainly wouldn't expect to be normally distributed if truly rare. On the other hand, as the events become more common (imagine emergency room admissions at a big hospital in a big city), the distribution will converge to the normal distribution, as it must by the Central Limit Theorem. That is what I meant when I said that cmp would not be appropriate for low counts, but could be OK for high counts.

However, it can be appropriate for low counts in other contexts. It is all a question of what we believe about the data generating process. For example, number of kids in a family. One can reasonably hypothesize that propensity to have more or fewer kids is an unobserved, continuous variable based on the normal distribution. It manifests as 0, 1, 2, etc. Then ordered probit is entirely appropriate.

(The funny thing here is that you could argue that which model you use for number of kids could depend on whether pregnancies are planned or unplanned. If unplanned, I suppose the Poisson model is right!)

--David
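
Editorial illustration, not part of the original thread: the exchange above turns on whether a Poisson count looks normal when its mean is large. One way to connect the CLT to counts is that a Poisson count with mean lam has the same distribution as a sum of many independent Poisson counts with smaller means, so the usual argument for sums applies. The sketch below is a rough numerical check under that reading. It compares the Poisson CDF with a normal CDF of the same mean and variance (with a continuity correction) and reports the largest gap. The function names and the example means 2 and 200 are assumptions chosen for illustration; nothing here uses or represents -cmp-.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of exactly k events when the mean is lam."""
    # Work on the log scale so large k and lam do not overflow.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and std. dev. sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def max_cdf_gap(lam):
    """Largest absolute gap between the Poisson(lam) CDF and the CDF of a
    normal with mean lam and variance lam, evaluated at k + 0.5
    (continuity correction) for k = 0, 1, 2, ..."""
    sigma = math.sqrt(lam)
    upper = int(lam + 10 * sigma + 10)  # far enough into the right tail
    cumulative = 0.0
    worst = 0.0
    for k in range(upper + 1):
        cumulative += poisson_pmf(k, lam)
        gap = abs(cumulative - normal_cdf(k + 0.5, lam, sigma))
        worst = max(worst, gap)
    return worst


if __name__ == "__main__":
    # Hypothetical means: a "rare event" setting and a "busy hospital" setting.
    for lam in (2, 200):
        print(lam, max_cdf_gap(lam))
```

The gap should be noticeably larger for the small mean than for the large one, which matches the point that a normal-based model is a poor fit for rare-event counts but may be tolerable for high counts. It does not bear on the separate ordered-probit argument, which rests on an assumed latent normal variable rather than on an approximation to the Poisson.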