Notice: On April 23, 2014, Statalist moved from an email list to a forum.

Re: st: count data truncated at one

From   David Hoaglin
Subject   Re: st: count data truncated at one
Date   Tue, 12 Jun 2012 07:33:23 -0400

So far, we have little information on the variable in question beyond
the statements
"People included in the regression are members of a group defined as
people paying 2 to ten times a reference number."
"Most of the observations have y=2, then the frequencies are
decreasing for higher values of y, but then when there is also a high
frequency of observations with y=10."
If values of y > 10 have been combined with y = 10 (perhaps because 10
was the highest multiple possible in the particular setting), then, as
Tirthankar suggested, the analysis should take into account the
censoring at 10.
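
As a rough illustration of what taking the censoring into account could
mean, here is a minimal sketch of a likelihood for a count that is observed
only when it is at least 2 and is recorded as 10 whenever it is 10 or more.
The Poisson distribution, the intercept-only mean mu, and all function names
are assumptions made for this example; the thread specifies none of them,
and the data's context may well call for a different model.

```python
import math


def poisson_pmf(k, mu):
    """P(Y* = k) for an untruncated Poisson count with mean mu."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def poisson_cdf(k, mu):
    """P(Y* <= k) for an untruncated Poisson count with mean mu."""
    return sum(poisson_pmf(j, mu) for j in range(k + 1))


def prob_observed(k, mu, lower=2, upper=10):
    """Probability of recording k when counts below `lower` are never
    observed (truncation) and counts of `upper` or more are all
    recorded as `upper` (censoring)."""
    if not lower <= k <= upper:
        return 0.0
    # Probability that a count enters the sample at all.
    keep = 1.0 - poisson_cdf(lower - 1, mu)
    if k == upper:
        # The censored cell collects the whole upper tail.
        return (1.0 - poisson_cdf(upper - 1, mu)) / keep
    return poisson_pmf(k, mu) / keep


def log_likelihood(ys, mu, lower=2, upper=10):
    """Log-likelihood of the recorded counts ys for a common mean mu."""
    return sum(math.log(prob_observed(y, mu, lower, upper)) for y in ys)


def expected_value(mu, lower=2, upper=10):
    """Mean of the recorded count; it lies in [lower, upper] by construction."""
    return sum(k * prob_observed(k, mu, lower, upper)
               for k in range(lower, upper + 1))
```

With covariates, mu would typically be replaced by exp(x'b) for each
observation and the log-likelihood maximised over b. Because the recorded
count has support only on 2, ..., 10, its expected value necessarily stays
within that range.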

In my brief experience with Statalist, I have seen a number of
questions that seek input on statistical analysis but give only
generic information about the data.  The fact that, for example, the
values of the dependent variable range from 2 to 10 is only a
beginning.  Every actual application has a context, which usually has
a substantial impact on successful analysis of the data.  As a
consultant, I expect to have a dialog with a client, learning about
the research question and the details of the data, before I recommend
a particular analysis.  It may not be possible to share some details
with the list (e.g., because they need to remain confidential), but
lack of information limits our ability to give effective advice.  We
often make a serious effort to be helpful, only to learn, when more
information emerges, that we were not addressing the right question.

David Hoaglin

On Tue, Jun 12, 2012 at 4:08 AM, Nick Cox wrote:
> Tirthankar is clearly correct in underlining the possibility of a
> customised model rather than forcing this into some pre-existing model
> that is not quite right. Note that you would need, for credibility, to
> ensure not only that the likelihood was defined appropriately but also
> that predicted values fall within [2,10].
> That said, the substantive or scientific choice should hinge largely
> on whether the response is considered as # items bought or the
> probability of # items being bought. I think here my view is close to
> that of David.
> Anyway, who said that you are restricted to a single model?
> Nick