# st: Re: RE: RE: RE: RE: RE: RE: Dependent variable [with zero mass point]

From: "Francesca Gagliardi"
Subject: st: Re: RE: RE: RE: RE: RE: RE: Dependent variable [with zero mass point]
Date: Wed, 6 Sep 2006 16:10:18 +0200

Dear Nick and Al,

Thank you very much for your replies to my question. I apologise for not having specified better how my dependent variable has been obtained. It is just a growth rate of firms, calculated as (employees at time t - employees at time t-1)/employees at time t-1. I have 15678 observations, of which 23.7% are negative values, 39.2% are zeros and the remaining 37.1% are positive values.
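For concreteness, the growth-rate calculation and the negative/zero/positive breakdown described above might be sketched as follows. This is only an illustration in Python (not Stata); the function names and the five example firms are hypothetical and are not taken from the actual data set.

```python
from collections import Counter


def growth_rate(emp_now, emp_prev):
    """Proportional change in employment between time t-1 and time t."""
    if emp_prev <= 0:
        raise ValueError("employees at time t-1 must be positive")
    return (emp_now - emp_prev) / emp_prev


def sign_shares(values):
    """Percentage of negative, zero and positive values in a list."""
    counts = Counter(
        "negative" if v < 0 else "zero" if v == 0 else "positive"
        for v in values
    )
    n = len(values)
    return {k: 100 * counts[k] / n for k in ("negative", "zero", "positive")}


# Hypothetical (employees at t, employees at t-1) pairs, for illustration only.
firms = [(12, 10), (10, 10), (8, 10), (10, 10), (15, 12)]
rates = [growth_rate(now, prev) for now, prev in firms]
shares = sign_shares(rates)
print(rates)
print(shares)
```

A firm whose headcount does not change contributes an exact zero, which is one way a large spike at zero can arise in a variable of this kind.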

Best wishes,
Francesca

----- Original Message -----
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Sent: Wednesday, September 06, 2006 4:30 PM
Subject: st: RE: RE: RE: RE: RE: RE: Dependent variable [with zero mass point]

```In order not to go round in circles, I will
merely record that the assertion "would not
apply" appears to me a tad dogmatic and certainly
disputable. A choice of model in my book should depend on
a match of science and statistics, and if a
spike is unsurprising scientifically I don't think
that statistical considerations dictate a
Heckman approach at all.

Without input from the original poster
on what the science is here and on what she is doing or intending,
it appears difficult to make progress here.

Nick
n.j.cox@durham.ac.uk

Feiveson, Alan H.

```
```Nick - I have no quarrel with your thoughts on absolutely continuous
distributions that have infinite range in both directions (by the way,
a skew-normal is another alternative); however, if there is a mass at zero
(or any other discrete point(s)), these models would not apply. In those
cases a separate model explaining the mass at zero is needed. One can
then use one of your proposed models to explain the distribution
conditional on a non-zero. From what I have seen on statalist, it sounds
as if the Heckman model might be useful for some of these situations.

```
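The two-step idea in the message above (one model for the mass at zero, another for the distribution conditional on a non-zero value) can be illustrated with a minimal, intercept-only sketch. This is an assumption-laden toy in Python, not a Heckman selection model and not anyone's recommended specification: there are no predictors, the function name `two_part_summary` is hypothetical, and the data are made up. In practice the first part would typically be a binary-outcome model and the second a regression fitted to the non-zero observations only.

```python
from statistics import mean


def two_part_summary(y):
    """Intercept-only two-part description of a variable with a mass at zero.

    Part 1: the probability that an observation is exactly zero.
    Part 2: the mean of the observations that are not zero.
    The overall mean is then P(non-zero) * E[y | y != 0].
    """
    nonzero = [v for v in y if v != 0]
    p_nonzero = len(nonzero) / len(y)
    cond_mean = mean(nonzero) if nonzero else 0.0
    return {
        "p_zero": 1 - p_nonzero,
        "cond_mean": cond_mean,
        "overall_mean": p_nonzero * cond_mean,
    }


# Made-up growth rates with a spike at zero and values of both signs.
y = [0.0, 0.0, -0.5, 0.25, 0.0, 1.0, -0.25, 0.5]
summary = two_part_summary(y)
print(summary)
```

Because the zeros add nothing to the sum, the product of the two parts reproduces the ordinary sample mean of `y`; the decomposition only becomes informative once each part is allowed to depend on predictors.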
```Nick Cox

```
```Reminds me that I wrote a few expository paragraphs a while
back on one device for variables with values of both signs.
Here is a slightly edited reprise of part of transint.hlp
(available from SSC as -transint-):
```
```Nick Cox

```
```> It may be that the zeros represent a _qualitatively_ distinct subset
> deserving separate modelling. But that would be substantive knowledge
> and isn't given here.
```
```Feiveson, Alan
```
```> > I would think that you would need distinct models for the
> > probability of a zero and for the conditional non-zero
> > distribution. Perhaps something like -heckman- might work. Even if
> > the zero is not important, you can't use something like a normal
> > distribution to model the variable unconditionally.
```
```Nick Cox

```
```> > Without more known about the underlying science, it is
> > difficult to comment.
> >
> > But one answer is that you don't necessarily need to do anything
> > special. It is the conditional distribution of response given predictors
> > that is the stochastic side of modelling, not the unconditional
> > distribution. Besides, a spike near the middle is not much of a
> > pathology compared with one at an extreme.
>
> Francesca Gagliardi
>
> > > I would be grateful if anyone could give me suggestions on how to deal
> > > with a dependent variable that has a mass point at zero and is
> > > continuously distributed over negative and positive values. In such a
> > > case, which is the most appropriate model to estimate?
```
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/