Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: How to model a positive continuous dependent variable with many zeros?

From: Hitesh Chandwani
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: How to model a positive continuous dependent variable with many zeros?
Date: Tue, 31 May 2011 09:19:02 -0400

```Have you considered a two-part model? You could first run a logistic
regression to model the probability of a non-zero seclusion
duration. Then, depending on the distribution of the non-zero durations,
you could fit an appropriate GLM to those observations only. Of course,
whether a two-part model is suitable depends on the question you are
trying to answer.

Regards,
Hitesh S. Chandwani
University of Texas at Austin

On Tue, May 31, 2011 at 3:24 AM, Adriaan Hoogendoorn
<aw.hoogendoorn@gmail.com> wrote:
> Dear Statalisters,
>
> I am trying to run regression models for two dependent variables that
> concern the seclusions of psychiatric patients:
> y1 = the number of seclusion incidents and y2 = the seclusion duration.
> Fortunately (at least from the patients' perspective), there are many zeroes.
> I successfully applied a Poisson model (xtpoisson) to model the number
> of seclusion incidents
> for patients (level 1) in different clinics (level 2)
> taking exposure time t (the time that a psychiatric patient spent in
> the clinic) into account:
> xtpoisson y1 x1 x2 x3, re exposure(t)
>
> I am running into problems when I try modeling the duration of seclusions.
> Because of the many zeroes (85%) and the successful analysis of the
> number of seclusion incidents, I applied the xtpoisson model for the duration
> variable as well. Obviously duration is not exactly a count variable,
> but I can count the number of hours within the duration, or the number
> of days, can’t I? So I estimated the duration by counting the number of hours.
>
> The problem appears when I alternatively measure the duration by counting
> the number of days. I get different model results for duration when
> I count the number of days than when I count the number of hours. In fact,
> the parameter estimates hardly change, but their significance levels
> are very sensitive to the scale (days, hours, or even minutes) on which
> duration is measured.
>
> It appears that in
> xtpoisson y2 x1 x2 x3, re exposure(t)
> it does not matter if “t” is measured in days or hours, but it does
> matter if the duration “y2” is measured in days, hours or minutes.
>
> So my trick of counting days or hours seems to fail, and modeling
> seclusion duration with a Poisson model does not seem to be a good idea.
> Therefore my question to you is: do you know of a model that can deal
> with a positive continuous dependent variable (duration) with many
> zeros?
>
> Kind regards,
> GGZ inGeest, Amsterdam
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

```
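
The two-part model suggested in the reply could be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: `y2`, `x1`, `x2`, and `x3` are the variable names from the original question, while `clinic` (a clinic identifier) and `any_seclusion` are hypothetical names, and the gamma family with a log link is just one possible choice for the second part. The reply leaves that choice open, depending on how the non-zero durations are distributed.

```stata
* Sketch of a two-part model. Assumed (hypothetical) names: clinic, any_seclusion.

* Part 1: probability that a patient has any seclusion at all (y2 > 0).
generate byte any_seclusion = (y2 > 0) if !missing(y2)
logit any_seclusion x1 x2 x3, vce(cluster clinic)

* Part 2: duration among patients with a non-zero duration only.
* family(gamma) link(log) is an assumption; inspect the distribution of
* the positive durations before settling on a family and link.
glm y2 x1 x2 x3 if y2 > 0, family(gamma) link(log) vce(cluster clinic)
```

Clustering the standard errors on clinic is a simple way to acknowledge the two-level structure; a random-effects specification would be an alternative. With a log link and an estimated scale parameter, changing the unit of `y2` from days to hours should shift only the intercept of the second part, so the slope estimates and their significance levels would not depend on the unit as they did with the Poisson count approach.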