Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Laurie Molina <molinalaurie@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: count data truncated at one |
Date | Tue, 12 Jun 2012 10:41:50 -0500 |
Ok, thank you all; as always, you have provided very useful insights.

I think I will go with ologit. Just one more thing: ologit is motivated by the existence of a latent variable and thresholds that define the value of the observed discrete variable. In my case, I do observe the underlying variable (payment/reference number); when this value is in a neighborhood around 2, I say that it pays 2 times the reference number, and so on. How can I add this information to the estimation? To my understanding, ologit does not take that information into account.

Sorry if I cannot provide very much additional information.

Thank you again,
LM

On Tue, Jun 12, 2012 at 6:33 AM, David Hoaglin <dchoaglin@gmail.com> wrote:
> So far, we have little information on the variable in question beyond
> the statements
> "People included in the regression are members of a group defined as
> people paying 2 to ten times a reference number."
> and
> "Most of the observations have y=2, then the frequencies are
> decreasing for higher values of y, but then when there is also a high
> frequency of observations with y=10."
> If values of y > 10 have been combined with y = 10 (perhaps because 10
> was the highest multiple possible in the particular setting), then, as
> Tirthankar suggested, the analysis should take into account the
> censoring at 10.
>
> In my brief experience with Statalist, I have seen a number of
> questions that seek input on statistical analysis but give only
> generic information about the data. The fact that, for example, the
> values of the dependent variable range from 2 to 10 is only a
> beginning. Every actual application has a context, which usually has
> a substantial impact on successful analysis of the data. As a
> consultant, I expect to have a dialog with a client, learning about
> the research question and the details of the data, before I recommend
> a particular analysis. It may not be possible to share some details
> with the list (e.g., because they need to remain confidential), but
> lack of information limits our ability to give effective advice. We
> often make a serious effort to be helpful, only to learn, when more
> information emerges, that we were not addressing the right question.
>
> David Hoaglin
>
> On Tue, Jun 12, 2012 at 4:08 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>> Tirthankar is clearly correct in underlining the possibility of a
>> customised model rather than forcing this into some pre-existing model
>> that is not quite right. Note that you would need, for credibility, to
>> ensure not only that the likelihood was defined appropriately but also
>> that predicted values fall within [2,10].
>>
>> That said, the substantive or scientific choice should hinge largely
>> on whether the response is considered as # items bought or the
>> probability of # items being bought. I think here my view is close to
>> that of David.
>>
>> Anyway, who said that you are restricted to a single model?
>>
>> Nick

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/