Re: st: count data truncated at one


From   Laurie Molina <molinalaurie@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: count data truncated at one
Date   Tue, 12 Jun 2012 10:41:50 -0500

Ok, thank you all, as always you have provided very useful insights.
I think I will go with the ologit. Just one more thing. ologit is
motivated by the existence of a latent variable and thresholds that
define the value of the observed discrete variable.
In my case, I do observe the underlying variable (payment/reference
number): when this value is in a neighborhood around 2, I say that the
person pays 2 times the reference number, and so on.
How can I add this information to the estimation? To my understanding,
ologit does not take that information into account.
Sorry if I cannot provide very much additional information.
Thank you again,
LM
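
For illustration only, here is a minimal Python sketch of the coding rule described above: the observed ratio payment/reference is mapped to an integer multiple when it lies in a neighborhood of that multiple. The function name and the tolerance `tol` are assumptions, since the width of the neighborhood is not given in the message.

```python
def multiple_of_reference(payment, reference, tol=0.25):
    """Map payment/reference to the nearest integer multiple k.

    Returns k when the ratio lies within `tol` of k, otherwise None.
    `tol` is a hypothetical neighborhood half-width, not a value
    taken from the original data description.
    """
    ratio = payment / reference
    k = round(ratio)
    if abs(ratio - k) <= tol:
        return k
    return None


# Example: a ratio of 2.05 is coded as 2; a ratio of 2.5 is left uncoded.
coded = multiple_of_reference(205, 100)
uncoded = multiple_of_reference(250, 100)
```

Keeping the continuous ratio next to the coded value makes explicit what an ordinal model fitted to the coded value alone would not use.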


On Tue, Jun 12, 2012 at 6:33 AM, David Hoaglin <dchoaglin@gmail.com> wrote:
> So far, we have little information on the variable in question beyond
> the statements
> "People included in the regression are members of a group defined as
> people paying 2 to ten times a reference number."
> and
> "Most of the observations have y=2, then the frequencies are
> decreasing for higher values of y, but then when there is also a high
> frequency of observations with y=10."
> If values of y > 10 have been combined with y = 10 (perhaps because 10
> was the highest multiple possible in the particular setting), then, as
> Tirthankar suggested, the analysis should take into account the
> censoring at 10.
>
> In my brief experience with Statalist, I have seen a number of
> questions that seek input on statistical analysis but give only
> generic information about the data.  The fact that, for example, the
> values of the dependent variable range from 2 to 10 is only a
> beginning.  Every actual application has a context, which usually has
> a substantial impact on successful analysis of the data.  As a
> consultant, I expect to have a dialog with a client, learning about
> the research question and the details of the data, before I recommend
> a particular analysis.  It may not be possible to share some details
> with the list (e.g., because they need to remain confidential), but
> lack of information limits our ability to give effective advice.  We
> often make a serious effort to be helpful, only to learn, when more
> information emerges, that we were not addressing the right question.
>
> David Hoaglin
>
> On Tue, Jun 12, 2012 at 4:08 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>> Tirthankar is clearly correct in underlining the possibility of a
>> customised model rather than forcing this into some pre-existing model
>> that is not quite right. Note that you would need, for credibility, to
>> ensure not only that the likelihood was defined appropriately but also
>> that predicted values fall within [2,10].
>>
>> That said, the substantive or scientific choice should hinge largely
>> on whether the response is considered as # items bought or the
>> probability of # items being bought. I think here my view is close to
>> that of David.
>>
>> Anyway, who said that you are restricted to a single model?
>>
>> Nick
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
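
The point made in the thread about a customised likelihood whose predictions stay within [2,10] can be sketched as follows. This is only one possible reading: it assumes a Poisson count truncated on both sides (no values below 2 or above 10 can be observed) and a single rate `lam` with no covariates. If values above 10 were instead recorded as 10, that is censoring and the likelihood would differ. All function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Untruncated Poisson probability P(Y = k)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def truncated_pmf(y, lam, lo=2, hi=10):
    """P(Y = y | lo <= Y <= hi) for a Poisson count with rate lam."""
    if not lo <= y <= hi:
        return 0.0
    norm = sum(poisson_pmf(k, lam) for k in range(lo, hi + 1))
    return poisson_pmf(y, lam) / norm


def log_likelihood(ys, lam, lo=2, hi=10):
    """Log-likelihood of counts ys, all assumed to lie in [lo, hi]."""
    return sum(math.log(truncated_pmf(y, lam, lo, hi)) for y in ys)


def truncated_mean(lam, lo=2, hi=10):
    """Expected count under truncation; always lies within [lo, hi]."""
    return sum(k * truncated_pmf(k, lam, lo, hi) for k in range(lo, hi + 1))
```

Because the truncated probabilities sum to one over 2..10, the predicted mean cannot fall outside that range, whatever the value of `lam`.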

