

From: Laurie Molina <molinalaurie@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Truncated at zero count data with underdispersion
Date: Mon, 11 Oct 2010 15:44:35 -0500

Thank you very much, I will work on your suggestion.

I would just like to ask for some comments on the following: what do
you think about a glm with a log link and a gamma distribution? With
the log link I ensure that the conditional expectation is positive. I
know I lose the possibility of predicting point probabilities, but
with the log gamma I can have underdispersion and still get
consistent estimates, can't I?

Thank you again!

On Mon, Oct 11, 2010 at 1:36 PM, Steve Samuels <sjsamuels@gmail.com> wrote:
> "Does anyone know any stata command that i could use to model zero
> truncated count data with underdispersion?"
>
> There are two possibilities:
> 1) Your model is inadequate
> or
> 2) The Poisson distribution doesn't fit your data (my best guess).
>
> If the Poisson model doesn't fit, use -mlogit- or -ologit-, with
> categories being the numbers of cell phones. You might have to
> combine sparse categories. Since your goal is prediction in an
> external data set, split the study data set into two parts; develop
> the model on one part, and assess the predictive accuracy of the model
> on the second. (There are probably also -jackknife- or -bootstrap-
> possibilities for getting cross-validated "honest" assessments of
> accuracy.)
>
> Here's an example of assessing predictive accuracy from -mlogit-. The
> predicted category is the one with the highest probability, and the
> predictive criterion is the difference between observed and predicted
> category and its root MSE.
>
> ***********CODE BEGINS*************
> sysuse auto, clear
> * combine the sparse categories 1 and 2
> recode rep78 1/2 = 2
> mlogit rep78 mpg trunk
> * predicted probability of each outcome
> forvalues i = 2/5{
> predict p`i', outcome(`i')
> }
> egen pmax = rowmax(p2 p3 p4 p5)
>
> * predicted category = outcome with the highest probability
> gen p_class = 2
> forvalues i =3/5{
> replace p_class = `i' if pmax ==p`i'
> }
> label var p_class "Predicted Category"
>
> gen diff = rep78 - p_class
> tab diff
> sum diff
> * -summarize- stores the variance as r(Var), not r(var);
> * r(Var) uses the N-1 divisor, so this MSE is approximate
> scalar mse = r(Var) + r(mean)^2
> di mse
> di sqrt(mse)
>
> ***********CODE ENDS**************
>
>
> On Sun, Oct 10, 2010 at 6:39 PM, Laurie Molina <molinalaurie@gmail.com> wrote:
>> Hi all,
>>
>> I have a question, I hope somebody can help me.
>>
>> I am modelling count data truncated at zero: the number of cell phones
>> of households with cell phones. The observed data go from 1 to 9,
>> with mean equal to 1.89 and variance 1.14.
>> I have done some underdispersion tests after running a Poisson
>> regression with the truncated data and I reject the one-sided
>> hypothesis of equidispersion with a p-value of zero (the predicted
>> values have mean 1.89 with variance equal to .51).
>>
>> Regarding the latent variable, I also have available the number of
>> cell phones of all the households, i.e. I have the data of the latent
>> variable, which goes from 0 to 9. Here the mean equals 1.15 and the
>> variance equals 1.54. I have also done an underdispersion test after
>> running a Poisson regression with all the data and I get
>> underdispersion (the predicted values have mean 1.15 but variance
>> equal to .999).
>>
>> I am interested in the truncated regression because I want to predict
>> the number of cell phones of the HH who have cell phones. I mean, in
>> addition to the data that I am using for this regression, I have
>> another list of households and I know whether or not they have a
>> cell phone. But among the HH in that list who do have a cell
>> phone, I do not know how many of them they have, and I am interested
>> in that.
>>
>> To my understanding, if I use a Poisson regression, given that my data
>> are truncated, I will get inconsistent estimates because the conditional
>> expectation will not be correctly specified as an exponential function
>> of xbeta.
>> So I have to use the command:
>> ****
>> ztp depvar indepvars
>> ****
>> But if the latent variable is not Poisson I will get inconsistent estimates.
>>
>> I know Stata also has available the zero-truncated negative binomial
>> regression, but since I get underdispersion in the latent variable I
>> think the data are not negative binomial distributed, so I will still
>> get inconsistent estimates.
>>
>> Does anyone know any Stata command that I could use to model zero
>> truncated count data with underdispersion?
>>
>> Thank you all very much in advance.
>>
>> Regards,
>>
>> Laurie.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
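The glm question raised at the top of this message can be sketched in Stata roughly as follows. This is only a minimal illustration of the syntax, not a recommendation for the cell phone data: it uses the auto data set, treats rep78 (values 1 to 5) as a stand-in for a strictly positive count, and the variable name mu_hat is made up for the example.

```stata
* Illustrative only: rep78 stands in for a zero-truncated count
sysuse auto, clear

* Gamma family with log link; vce(robust) requests standard errors
* that do not rely on the gamma variance function being correct
glm rep78 mpg trunk, family(gamma) link(log) vce(robust)

* Fitted conditional means, exp(xb); mu_hat is a hypothetical name
predict mu_hat, mu

* Compare observed values and fitted means on the estimation sample
summarize rep78 mu_hat if e(sample)
```

Because of the log link, every fitted mean is exp(xb) and therefore positive. The model returns fitted means only, not a probability for each count, which is the loss of point probabilities mentioned above.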

**Follow-Ups**:
- **Re: st: Truncated at zero count data with underdispersion** *From:* Maarten buis <maartenbuis@yahoo.co.uk>
- **Re: st: Truncated at zero count data with underdispersion** *From:* Steve Samuels <sjsamuels@gmail.com>

**References**:
- **st: Truncated at zero count data with underdispersion** *From:* Laurie Molina <molinalaurie@gmail.com>
- **Re: st: Truncated at zero count data with underdispersion** *From:* Steve Samuels <sjsamuels@gmail.com>
