Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Count data model with underdispersion

| | |
|---|---|
| From | F Tollnek |
| To | statalist@hsphsun2.harvard.edu |
| Subject | Re: st: Count data model with underdispersion |
| Date | Mon, 23 Jan 2012 13:44:09 +0100 |

Dear Maarten,

Thank you very much for your answer! I might also estimate a model in which I keep the persons with zero children. The problem is that the data stem from the 18th century, and sometimes it is not clear whether people had no children or whether their children were simply not reported. There are also couples whose children come from another marriage, for whom the number of children is set to zero. As we only want to analyse families with children of both spouses, we have to leave out the couples with "zero" children at some point.
Regarding the modeling: yes, I want to estimate the average number of children given some explanatory variables.
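
As a sketch of what "the average number of children" means with and without truncation, assume a log link and write $x_i$ for the explanatory variables and $\beta$ for the coefficients (this notation is an assumption, it is not used in the thread):

$$
E[y_i \mid x_i] = \lambda_i = \exp(x_i'\beta), \qquad
E[y_i \mid x_i,\, y_i > 0] = \frac{\lambda_i}{1 - e^{-\lambda_i}} .
$$

Under these assumptions the mean among couples with at least one child is always larger than $\lambda_i$, so the two quantities should not be confused when interpreting a zero-truncated model.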
Thanks again!

Best

Franziska

Quoting Maarten Buis <maartenlbuis@gmail.com>:

On Mon, Jan 23, 2012 at 11:43 AM, F Tollnek wrote:

```
Because there are uncertainties about the people who reported "zero"
children, I have to truncate the data at the zero values. As often proposed,
I used the Zero Truncated Poisson model for analyzing my data, but the
problem is that there is significant underdispersion given.
```
```
I would need some really good reasons before doing that. In all
likelihood you have created more problems than you solved, and you are
much better off by keeping the childless persons (or couples) in your
analysis. I would not be surprised if it is still significantly
under-dispersed, but then you can use Joe Hilbe's -gnpoisson- (see:
-ssc desc gnpoisson-).

All this depends on what you really want to model: the average number
of kids given some explanatory variables or the decision process that
leads to 0, 1, 2, ... kids. In the latter case none of these models
apply: The decision to have no children is very different from the
decision to have 1 or more children. The decision between 1 or 2 or more
children is likely to be very different from the decision between 2
and 3 or more children, etc. So trying to summarize all these
decisions with one set of parameters is likely to fail.

This suggests to me a sequential logit model, but there are some
difficulties with giving a causal interpretation to the parameters,
which is probably what you have in mind when you are interested in the
decision process. It is even difficult to define what a causal effect
would be in such situations, see (Mare 2011).

Hope this helps,
Maarten

Robert D. Mare (2011) Introduction to symposium on unmeasured
heterogeneity in school transition models, Research in Social
Stratification and Mobility, Volume 29, Issue 3, Pages 239-245.

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
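
The underdispersion discussed in this thread can be illustrated with a rough descriptive check. The following is a minimal sketch in Python, not Stata's implementation and not a formal test: it fits an intercept-only zero-truncated Poisson by solving the first-order condition, then compares the observed spread of the counts with the variance the model implies. All function names are hypothetical, and the counts in the usage part are made-up toy values, not data from the thread.

```python
import math


def ztp_mean(lam):
    """Mean of a Poisson(lam) count conditional on being greater than zero."""
    # -expm1(-lam) equals 1 - exp(-lam) but stays accurate for small lam.
    return lam / -math.expm1(-lam)


def ztp_var(lam):
    """Variance of a zero-truncated Poisson(lam) count."""
    m = ztp_mean(lam)
    return m * (1.0 + lam - m)


def fit_ztp_lambda(sample_mean, tol=1e-12):
    """Intercept-only MLE: solve ztp_mean(lam) = sample_mean by bisection."""
    if sample_mean <= 1.0:
        raise ValueError("the mean of strictly positive counts must exceed 1")
    # ztp_mean(lam) > lam for every lam > 0, so the root lies below the mean.
    lo, hi = 1e-12, sample_mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ztp_mean(mid) < sample_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pearson_dispersion(counts):
    """Pearson statistic divided by residual degrees of freedom (n - 1).

    Values well below 1 suggest underdispersion relative to the
    zero-truncated Poisson; values well above 1 suggest overdispersion.
    """
    n = len(counts)
    mean = sum(counts) / n
    lam = fit_ztp_lambda(mean)
    var = ztp_var(lam)
    chi2 = sum((y - mean) ** 2 / var for y in counts)
    return chi2 / (n - 1)


if __name__ == "__main__":
    # Hypothetical toy counts of children per couple, all at least 1.
    toy_counts = [4, 5, 5, 6, 5, 4, 6, 5]
    print("dispersion:", pearson_dispersion(toy_counts))
```

With covariates the fitted mean and variance would differ by observation, so this intercept-only version is only meant to show the idea behind the comparison.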