
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Truncated at zero count data with underdispersion


From   Steve Samuels <[email protected]>
To   [email protected]
Subject   Re: st: Truncated at zero count data with underdispersion
Date   Mon, 11 Oct 2010 20:07:26 -0400

"What do you think about a glm log gamma distribution?"

I don't think much of it. Your data are discrete and bounded.

Steve
[email protected]




On Mon, Oct 11, 2010 at 4:44 PM, Laurie Molina <[email protected]> wrote:
> Thank you very much; I will work on your suggestion.
> I would just like to ask for some comments on the following:
> What do you think about a glm log gamma distribution?
> With the log link I ensure that the conditional expectation is
> positive. I know I lose the possibility of predicting point
> probabilities, but with the log gamma I can allow for underdispersion
> and still get consistent estimates, can't I?
>
> Thank you again!
>
>
>
>
>
> On Mon, Oct 11, 2010 at 1:36 PM, Steve Samuels <[email protected]> wrote:
>> "Does anyone know any Stata command that I could use to model
>> zero-truncated count data with underdispersion?"
>>
>> There are two possibilities:
>> 1) Your model is inadequate
>> or
>> 2) The Poisson distribution doesn't fit your data (my best guess).
>>
>> If the Poisson model doesn't fit, use -mlogit- or -ologit-,  with
>> categories being the numbers of cell phones.  You might have to
>> combine sparse categories.  Since your goal is prediction in an
>> external data set,  split the study data set into two parts; develop
>> the model on one part, and assess the predictive accuracy of the model
>> on the second.  (There are probably also -jackknife- or -bootstrap-
>> possibilities for getting cross-validated "honest" assessments of
>> accuracy.)
>>
>> Here's an example of assessing predictive accuracy from -mlogit-. The
>> predicted category is the one with the highest probability, and the
>> predictive criterion is the difference between observed and predicted
>> category, summarized by its MSE and root MSE.
>>
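>> ***********CODE BEGINS*************
>> sysuse auto, clear
>> * Combine the sparse categories 1 and 2
>> recode rep78 1/2 = 2
>> mlogit rep78 mpg trunk
>>
>> * Predicted probability of each outcome category
>> forvalues i = 2/5 {
>>     predict p`i', outcome(`i')
>> }
>> egen pmax = rowmax(p2 p3 p4 p5)
>>
>> * Predicted category = the one with the highest probability
>> gen p_class = 2
>> forvalues i = 3/5 {
>>     replace p_class = `i' if pmax == p`i'
>> }
>> label var p_class "Predicted Category"
>>
>> * Observed minus predicted category (missing where rep78 is missing)
>> gen diff = rep78 - p_class
>> tab diff
>> sum diff
>> * -summarize- returns r(Var) with a capital V; r(var) would be missing
>> scalar mse = r(Var) + r(mean)^2
>> di mse
>> di sqrt(mse)
>>
>> ***********CODE ENDS**************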
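>>
>> A note on the last step of the code, offered as a sketch rather than
>> as part of the example itself. Writing d_i for -diff- and n for the
>> number of non-missing values (notation introduced here), the mean
>> squared difference decomposes as
>>
>>   \frac{1}{n}\sum_{i=1}^{n} d_i^2 = \frac{n-1}{n}\, s^2 + \bar{d}^2
>>
>> where \bar{d} is r(mean) and s^2 is the variance that -summarize-
>> reports, which uses an n-1 divisor. Adding the reported variance to
>> the squared mean therefore overstates the exact mean squared
>> difference by s^2/n, which matters only when n is small.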
>>
>>
>>
>>
>> On Sun, Oct 10, 2010 at 6:39 PM, Laurie Molina <[email protected]> wrote:
>>> Hi all,
>>>
>>> I have a question; I hope somebody can help me.
>>>
>>> I am modelling count data truncated at zero: the number of cell phones
>>> in households that have cell phones. The observed data range from 1 to 9,
>>> with mean 1.89 and variance 1.14.
>>> I have done some underdispersion tests after running a Poisson
>>> regression with the truncated data, and I reject the hypothesis of
>>> equidispersion against the one-sided alternative with a p-value of
>>> zero (the predicted values have mean 1.89 and variance .51).
>>>
>>> Regarding the latent variable, I also have available the number of
>>> cell phones for all the households, i.e. I have the data on the latent
>>> variable, which ranges from 0 to 9. Here the mean equals 1.15 and the
>>> variance equals 1.54. I have also done an underdispersion test after
>>> running a Poisson regression with all the data, and I get
>>> underdispersion (the predicted values have mean 1.15 but variance
>>> .999).
>>>
>>> I am interested in the truncated regression because I want to predict
>>> the number of cell phones of the households (HH) that have cell phones.
>>> That is, in addition to the data that I am using for this regression, I
>>> have another list of households, and I know whether or not each of them
>>> has a cell phone. But for the HH in that list that do have a cell
>>> phone, I do not know how many they have, and that is what I am
>>> interested in.
>>>
>>> To my understanding, if I use a Poisson regression, given that my data
>>> are truncated, I will get inconsistent estimates because the conditional
>>> expectation will not be correctly specified as an exponential function
>>> of xbeta.
>>> So I have to use the command:
>>> ****
>>> ztp depvar indepvars
>>> ****
>>> But if the latent variable is not Poisson I will get inconsistent estimates.
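>>>
>>> To make the misspecification concrete, here are the standard results
>>> for a zero-truncated Poisson variable, with \lambda = \exp(x\beta)
>>> denoting the mean of the untruncated variable (notation introduced
>>> here for illustration):
>>>
>>>   P(Y = y \mid Y > 0) = \frac{e^{-\lambda}\lambda^{y}}{y!\,(1 - e^{-\lambda})}, \quad y = 1, 2, \ldots
>>>
>>>   E[Y \mid Y > 0] = \frac{\lambda}{1 - e^{-\lambda}}
>>>
>>> The truncated mean is larger than \exp(x\beta), so an ordinary
>>> Poisson regression fitted to the truncated data has the wrong
>>> conditional expectation.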
>>>
>>> I know Stata also has available the zero-truncated negative binomial
>>> regression, but since I get underdispersion in the latent variable, I
>>> think the data are not negative binomial distributed, so I will still
>>> get inconsistent estimates.
>>>
>>> Does anyone know any Stata command that I could use to model
>>> zero-truncated count data with underdispersion?
>>>
>>> Thank you all very much in advance.
>>>
>>> Regards,
>>>
>>> Laurie.
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>>>


© Copyright 1996–2018 StataCorp LLC