



Re: st: count data truncated at one


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: count data truncated at one
Date   Tue, 12 Jun 2012 09:08:15 +0100

Tirthankar is clearly correct in underlining the possibility of a
customised model rather than forcing this into some pre-existing model
that is not quite right. Note that you would need, for credibility, to
ensure not only that the likelihood was defined appropriately but also
that predicted values fall within [2,10].

That said, the substantive or scientific choice should hinge largely
on whether the response is considered as # items bought or the
probability of # items being bought. I think here my view is close to
that of David.

Anyway, who said that you are restricted to a single model?

Nick

On Tue, Jun 12, 2012 at 5:34 AM, Tirthankar Chakravarty
<tirthankar.chakravarty@gmail.com> wrote:

> Substantive issues aside, a statistical model for the kind of outcomes
> you have is available in -tnbreg- which fits a truncated (at arbitrary
> positive value) Negative Binomial model to your data. Note, however,
> it does not handle top _censoring_ of the data at 10 - which from the
> description of your dataset there appears to be. To accommodate this
> possibility, you might want to look at Chapter 12 in "Negative
> Binomial Regression" (Hilbe, 2011, Cambridge University Press) :
> http://dx.doi.org/10.1017/CBO9780511973420.013
> to build a model of lower truncation, upper censoring, write out its
> likelihood and estimate via -ml-.
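
As a rough illustration of the likelihood described above (not code from the thread), here is a minimal Python sketch of the per-observation log-likelihood for a negative binomial outcome that is truncated below at 2 and right-censored at 10. It assumes the NB2 parameterization (mean mu, dispersion alpha) and a log link. The function names nb2_logpmf, obs_loglik and loglik are hypothetical, and the summation is deliberately naive rather than numerically hardened.

```python
import math


def nb2_logpmf(y, mu, alpha):
    """Log pmf of the NB2 negative binomial with mean mu and dispersion alpha."""
    r = 1.0 / alpha
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))


def obs_loglik(y, mu, alpha, lower=2, upper=10):
    """Log-likelihood of one observation under lower truncation and upper censoring.

    Assumption: only counts >= lower enter the sample (truncation), and a
    recorded value of `upper` means "upper or more" (right censoring).
    """
    if not lower <= y <= upper:
        raise ValueError("y must lie in [lower, upper]")
    # P(Y >= lower): probability of being observed at all.
    p_ge_lower = 1.0 - sum(math.exp(nb2_logpmf(k, mu, alpha)) for k in range(lower))
    if y < upper:
        # Uncensored value: pmf renormalized by the truncation probability.
        return nb2_logpmf(y, mu, alpha) - math.log(p_ge_lower)
    # Censored value: survival probability P(Y >= upper), renormalized.
    p_ge_upper = 1.0 - sum(math.exp(nb2_logpmf(k, mu, alpha)) for k in range(upper))
    return math.log(p_ge_upper) - math.log(p_ge_lower)


def loglik(beta, alpha, X, ys):
    """Sample log-likelihood with a log link: mu_i = exp(x_i . beta)."""
    total = 0.0
    for x, y in zip(X, ys):
        mu = math.exp(sum(b * xi for b, xi in zip(beta, x)))
        total += obs_loglik(y, mu, alpha)
    return total
```

A quick sanity check on any such likelihood is that the implied probabilities over the observable values 2, ..., 10 sum to one for any mu and alpha. Maximizing loglik over beta and alpha would be left to a general-purpose optimizer, which is the role -ml- plays in Stata.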

On Mon, Jun 11, 2012 at 7:39 PM, David Hoaglin <dchoaglin@gmail.com> wrote:

>> If people were included because they paid 2, 3, ..., 10 times a
>> reference number, the multiple does not look like the value of a
>> dependent variable.  Instead, it looks like the definition of 9
>> subgroups.  If the regression model is trying to predict the subgroup
>> that a person belongs to, -ologit- may be an appropriate approach,
>> especially with the higher frequency at 10x.

On Mon, Jun 11, 2012 at 9:01 PM, Laurie Molina <molinalaurie@gmail.com> wrote:

>>> Yes, there are structural reasons why only those responses are possible.
>>> People included in the regression are members of a group defined as
>>> people paying 2 to 10 times a reference number.
>>> I was thinking of -ologit-, but as there is cardinality involved, I was
>>> looking for a method that would consider all the available
>>> information, that is, a method that would consider both the cardinal
>>> and ordinal properties of my data.
>>> I was thinking of rescaling the dataset so that 2 becomes 0, 3
>>> becomes 1, and so on. I know that this would not solve the high
>>> frequency of 10's (8's after rescaling), but I think my coefficients
>>> will still consistently estimate population parameters, as maximum
>>> likelihood estimation with Poisson is robust to incorrect
>>> specification of the distribution as long as the conditional
>>> expectation function is correctly specified...
>>> Would it be terrible to do such a rescaling?
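
To make the rescaling proposal concrete, here is a small Python sketch (not from the thread) that subtracts 2 from the outcome and fits a Poisson regression with a log link by Newton-Raphson. The data are invented purely for illustration, and fit_poisson is a hypothetical helper handling an intercept plus one covariate. Note what the shifted model assumes: E[y - 2 | x] = exp(b0 + b1*x), so the implied mean of the original outcome is 2 + exp(b0 + b1*x).

```python
import math


def fit_poisson(x, y, iters=50):
    """Newton-Raphson for Poisson regression, E[y|x] = exp(b0 + b1*x)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2x2.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1


# Invented illustrative data: outcome takes values in 2..10.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [2, 3, 2, 4, 5, 7, 10, 10]

shifted = [yi - 2 for yi in y]          # 2 becomes 0, 3 becomes 1, ...
b0, b1 = fit_poisson(x, shifted)

# Predictions on the original scale: always above 2, but not capped at 10.
fitted = [2 + math.exp(b0 + b1 * xi) for xi in x]
print(b0, b1)
print(fitted)
```

Because exp(.) is strictly positive, the fitted values respect the lower bound of 2 by construction. Nothing in the model imposes the upper bound of 10, so predictions should be checked against that limit separately.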


