
Re: st: count data truncated at one


From:    Tirthankar Chakravarty <tirthankar.chakravarty@gmail.com>
To:      statalist@hsphsun2.harvard.edu
Subject: Re: st: count data truncated at one
Date:    Mon, 11 Jun 2012 21:34:40 -0700

Laurie,

Substantive issues aside, a statistical model for the kind of outcomes
you have is available in -tnbreg-, which fits a negative binomial
model truncated at an arbitrary positive value. Note, however, that it
does not handle top _censoring_ of the data at 10, which, from the
description of your dataset, there appears to be. To accommodate this
possibility, you might want to look at Chapter 12 of "Negative
Binomial Regression" (Hilbe, 2011, Cambridge University Press):
http://dx.doi.org/10.1017/CBO9780511973420.013
to build a model with lower truncation and upper censoring, write out
its likelihood, and estimate it via -ml-.
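
For illustration, with placeholder names -y- (the count, 2 to 10) and
covariates x1 and x2 (none of these are from your data), the
truncated-only model is simply

    tnbreg y x1 x2, ll(1)

and a skeleton of the truncated-and-censored likelihood coded for -ml-
might look like the following. To keep it short I use a Poisson kernel
rather than the negative binomial; the NB version in Hilbe's chapter
adds the overdispersion parameter as a second equation.

program define lftruncpois
        version 12
        args lnf xb
        tempvar mu lnden
        quietly generate double `mu'    = exp(`xb')
        * normalizer for truncation below at 1: ln Pr(Y > 1)
        quietly generate double `lnden' = ln(1 - poisson(`mu', 1))
        * uncensored counts, 2 <= y <= 9
        quietly replace `lnf' = -`mu' + $ML_y1*`xb' ///
                - lnfactorial($ML_y1) - `lnden' if $ML_y1 < 10
        * counts recorded as 10, treated as right-censored: ln Pr(Y >= 10)
        quietly replace `lnf' = ln(1 - poisson(`mu', 9)) - `lnden' ///
                if $ML_y1 == 10
end

ml model lf lftruncpois (y = x1 x2)
ml maximize

This is only a sketch of the structure; the NB algebra in the chapter
is what you would actually want to code up.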

T

On Mon, Jun 11, 2012 at 7:39 PM, David Hoaglin <dchoaglin@gmail.com> wrote:
> Laurie,
>
> If people were included because they paid 2, 3, ..., 10 times a
> reference number, the multiple does not look like the value of a
> dependent variable.  Instead, it looks like the definition of 9
> subgroups.  If the regression model is trying to predict the subgroup
> that a person belongs to, -ologit- may be an appropriate approach,
> especially with the higher frequency at 10x.
>
> David Hoaglin
>
> On Mon, Jun 11, 2012 at 9:01 PM, Laurie Molina <molinalaurie@gmail.com> wrote:
>> Nick,
>> Thanks for your reply.
>> Yes, there are structural reasons why only those responses are possible.
>> People included in the regression are members of a group defined as
>> people paying 2 to 10 times a reference number.
>> I was thinking of -ologit-, but as there is cardinality involved, I was
>> looking for a method that would use all the available
>> information, that is, a method that would consider both the cardinal
>> and ordinal properties of my data.
>> I was thinking of rescaling the dataset so that 2 becomes 0, 3
>> becomes 1, and so on. I know that this would not solve the high
>> frequency of 10's (8's after rescaling), but I think my coefficients
>> would still consistently estimate the population parameters, as maximum
>> likelihood estimation with Poisson is robust to incorrect
>> specification of the distribution as long as the conditional
>> expectation function is correctly specified...
>> Would it be terrible to do such a rescaling?
>> Thank you again!
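
As a quick illustration of the two alternatives discussed in the
quoted messages above (the variable names -multiple-, x1, and x2 are
placeholders, not anything in Laurie's data):

* ordered logit on the 9 categories, reporting odds ratios
ologit multiple x1 x2, or

* or rescale so the support starts at 0 and lean on the quasi-ML
* argument, with robust standard errors
generate byte y0 = multiple - 2
poisson y0 x1 x2, vce(robust)

Neither of these deals with the pile-up at 10; the censored
likelihood sketched earlier is one way to handle that.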



-- 
Tirthankar Chakravarty
tchakravarty@ucsd.edu
tirthankar.chakravarty@gmail.com

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

