

From: Steve Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Truncated at zero count data with underdispersion
Date: Mon, 11 Oct 2010 14:36:42 -0400

"> Does anyone know any stata command that i could use to model zero
> truncated count data with underdispersion?"

There are two possibilities: 1) your model is inadequate, or 2) the Poisson distribution doesn't fit your data, which is my best guess.

If the Poisson model doesn't fit, use -mlogit- or -ologit-, with the categories being the numbers of cell phones. You might have to combine sparse categories.

Since your goal is prediction in an external data set, split the study data set into two parts; develop the model on one part, and assess the predictive accuracy of the model on the second. (There are probably also -jackknife- or -bootstrap- possibilities for getting cross-validated "honest" assessments of accuracy.)

Here's an example of assessing predictive accuracy from -mlogit-. The predicted category is the one with the highest probability, and the predictive criterion is the difference between the observed and predicted categories, summarized by its root MSE.

***********CODE BEGINS*************
sysuse auto, clear
* Combine the sparse categories 1 and 2
recode rep78 1/2 = 2
mlogit rep78 mpg trunk
* Predicted probability of each category
forvalues i = 2/5 {
    predict p`i', outcome(`i')
}
egen pmax = rowmax(p2 p3 p4 p5)
* Predicted category = the one with the highest probability
gen p_class = 2
forvalues i = 3/5 {
    replace p_class = `i' if pmax == p`i'
}
label var p_class "Predicted Category"
gen diff = rep78 - p_class
tab diff
sum diff
* MSE is approximately variance + mean^2; -summarize- stores r(Var), capital V
scalar mse = r(Var) + r(mean)^2
di mse
di sqrt(mse)
***********CODE ENDS**************

On Sun, Oct 10, 2010 at 6:39 PM, Laurie Molina <molinalaurie@gmail.com> wrote:
> Hi all,
>
> I have a question, i hope somebody can help me.
>
> I am modelling count data truncated at zero: The number of cell phones
> of households with cell phones. The observed data goes from 1 to 9,
> with mean equal to 1.89 and variance 1.14.
> I have done some underdispersion tests after running a poisson
> regression with the truncated data and i reject the one sided
> hypothesis of equidispersion with a p-value of zero. (the predicted
> values have mean 1.89 with variance equal .51).
>
> Regarding the latent variable, i also have available the number of
> cellphones of all the households, i.e. i have the data of the latent
> variable that goes from 0 to 9. Here the mean equals 1.15 and the
> variance equals 1.54. I have also done an underdispersion test after
> running a poisson regression with all the data and i get
> underdispersion (the predicted values have mean 1.15 but variance
> equal .999).
>
> I am interested in the truncated regression because i want to predict
> the number of cell phones of the HH who have cell phones. I mean, in
> addition to the data that i am using for this regression, i have
> another list of households and i know whether they have or they don't
> have a cell phone. But among the HH in that list who do have a cell
> phone, i do not know how many of them they have, and i am interested
> in that.
>
> To my understanding, if i use a poisson regression, given that my data is
> truncated i will get inconsistent estimates because the conditional
> expectation will not be correctly specified as an exponential function
> of xbeta.
> So i have to use the command:
> ****
> ztp depvar indepvars
> ****
> But if the latent variable is not poisson i will get inconsistent estimates.
>
> I know stata also has available the zero truncated negative binomial
> regression, but since i get underdispersion in the latent variable i
> think the data is not negative binomial distributed so i will still
> get inconsistent estimates.
>
> Does anyone know any stata command that i could use to model zero
> truncated count data with underdispersion?
>
> Thank you all very much in advance.
>
> Regards,
>
> Laurie.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
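The split-sample suggestion in the reply can be sketched in the same style as the -mlogit- example. This is only an illustration under stated assumptions: the 50/50 random split, the seed value, and the indicator name `train` are not from the original post, and the auto data stand in for the cell phone data. The model is fitted on the development half and the root MSE is computed on the holdout half only.

```stata
* Split-sample sketch: fit on a random half, assess on the other half.
sysuse auto, clear
* Combine sparse categories, as in the reply
recode rep78 1/2 = 2
* Arbitrary seed, chosen here only for reproducibility (an assumption)
set seed 20101011
* Hypothetical indicator: 1 = development half, 0 = holdout half
gen byte train = runiform() < 0.5

* Develop the model on the development half only
mlogit rep78 mpg trunk if train

* -predict- fills in probabilities for all observations, holdout included
forvalues i = 2/5 {
    predict p`i', outcome(`i')
}
egen pmax = rowmax(p2 p3 p4 p5)
gen p_class = 2
forvalues i = 3/5 {
    replace p_class = `i' if pmax == p`i'
}

* Assess accuracy only on observations not used for fitting
gen diff = rep78 - p_class
tab diff if !train
quietly summarize diff if !train
* r(Var) uses an n-1 divisor, so rescale to get the exact mean squared error
scalar mse = r(Var)*(r(N) - 1)/r(N) + r(mean)^2
display "holdout root MSE = " sqrt(mse)
```

With a sample this small, a random half can leave a category very thin, so the split may need to be stratified on the outcome or replaced by the cross-validation approaches mentioned in the reply.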

