It may be that the zeros represent a _qualitatively_ distinct
subset deserving separate modelling. But that would
be substantive knowledge and isn't given here.
I can imagine various situations in which this
might arise:
1. The data are changes. The zeros represent a large subset
who didn't change over the period, or so the data say.
2. The data are amounts in accounts. Positive: the "bank"
owes the client, negative: the client owes the "bank",
zero: neither (perhaps no transactions, etc.).
In these examples, the zeros are definitely intermediate
qualitatively as well as quantitatively.
Nick
n.j.cox@durham.ac.uk
Feiveson, Alan
> I would think that you would need distinct models for the
> probability of a zero and for the conditional non-zero
> distribution. Perhaps something like -heckman- might work. Even if
> the zero is not important, you can't use something like a normal
> distribution to model the variable unconditionally.
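
The "distinct models" suggestion above can be read as a two-part model: one equation for the probability of a zero, and a separate one for the response among the non-zero observations. What follows is only a minimal sketch of that reading in Python on simulated data. It is not -heckman-, which is a selection model allowing correlated errors between the two equations; here the two parts are fitted independently. The function names fit_logit and fit_two_part, the single predictor x, and all coefficient values are assumptions made up for illustration.

```python
import numpy as np

def fit_logit(X, z, iters=25):
    """Logistic regression by Newton-Raphson; X must include a constant column."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))      # fitted P(z == 1 | X)
        grad = X.T @ (z - p)                     # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def fit_two_part(x, y):
    """Part 1: logit for P(y == 0 | x). Part 2: OLS of y on x where y != 0."""
    X = np.column_stack([np.ones_like(x), x])
    nonzero = y != 0
    beta_zero = fit_logit(X, (~nonzero).astype(float))
    beta_cont, *_ = np.linalg.lstsq(X[nonzero], y[nonzero], rcond=None)
    return beta_zero, beta_cont

# Simulated data (hypothetical): a spike at zero plus a continuous part
# that takes both negative and positive values.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
p_zero = 1.0 / (1.0 + np.exp(-(-1.0 + 0.5 * x)))   # true P(y == 0 | x)
is_zero = rng.random(n) < p_zero
y = np.where(is_zero, 0.0, 0.5 + 1.5 * x + rng.normal(size=n))

beta_zero, beta_cont = fit_two_part(x, y)
print("logit for zero (const, x):", beta_zero)
print("OLS on non-zeros (const, x):", beta_cont)
```

Whether the two parts can really be treated as independent, or whether the plain regression on all observations would do just as well, depends on the underlying science, which the sketch does not address.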
Nick Cox
> Without more known about the underlying science, it is difficult to
> comment.
>
> But one answer is that you don't necessarily need to do anything
> special. It is the conditional distribution of response given
> predictors that is the stochastic side of modelling, not the
> unconditional distribution. Besides, a spike near the middle is not
> much of a pathology compared with one at an extreme.
Francesca Gagliardi
> > I would be grateful if anyone could give me suggestions on how to
> > deal with a dependent variable that has a mass point at zero and is
> > continuously distributed over negative and positive values. In such
> > a case, which is the most appropriate model to estimate?