Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: RE: st: Hurdle model

From	Neil Hewitt <[email protected]>
To	"[email protected]" <[email protected]>
Subject	RE: RE: st: Hurdle model
Date	Mon, 16 Sep 2013 09:36:58 +0100

Dear Ariel,

Many thanks for your reply. I am not an econometrician, so I am feeling my way around the problems I am facing. I chose a hurdle model because, as you rightly pointed out, my outcome is a rare event: around 97% of my outcomes are zeros.

I assumed I had to correct for this, as otherwise the parameter estimates would be biased. Of the two approaches commonly used to handle excess zeros, zero-inflated and hurdle models, the assumptions behind the latter seemed to fit my data better. In particular, I could not say that some zeros arose because part of my population was excluded from the outcome, i.e. that it was impossible for them to have one.
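A rough way to check whether a high share of zeros is really "excess" is to compare it with the share a plain Poisson model would already imply for the same mean. The sketch below is in Python rather than Stata, and its figures are assumptions made for illustration: the thread reports about 97% zeros but no overall mean, so the 0.05 events per patient is hypothetical.

```python
import math

def poisson_zero_share(mean_count):
    """Share of zeros a plain Poisson model implies for a given mean."""
    return math.exp(-mean_count)

# Hypothetical figures, NOT taken from the data discussed in this thread.
observed_zero_share = 0.97   # roughly the share of zeros mentioned above
overall_mean = 0.05          # assumed mean number of events per patient

implied = poisson_zero_share(overall_mean)
excess = observed_zero_share - implied

print(f"Poisson-implied zero share: {implied:.4f}")
print(f"Observed minus implied:     {excess:+.4f}")
```

If the gap is small, most of the zeros are explained by the event simply being rare, and the case for a separate zero process is weaker.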

However, as you say, my separation is a somewhat artificial one, made to address my excess-zero problem. What I am saying is that patients who have had an event differ in some way from those who have not, hence the hurdle separation. But it is clearly something I will need to give further thought to.

Thanks,

Neil



-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Ariel Linden, DrPH
Sent: 15 September 2013 21:40
To: [email protected]
Subject: re: RE: st: Hurdle model

Neil,

Your justification for choosing a hurdle model does not sound appropriate to me. 

You argue that "all individuals" have the same chronic illness and "there is little/no chance of them having an event without it being recorded."

How does this logic differentiate between those having/not having a hospitalization (the binary component), and those having multiple admissions (the count component)? The idea is that patients not having an admission (i.e., not crossing the hurdle) will have certain characteristics that would differentiate them from those having multiple admissions (that's why you estimate two models here). There is nothing in your statements that would answer that point.
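The two-model structure described above can be sketched for the simplest case with no covariates. This is a minimal Python illustration, not the analysis discussed in the thread: the function name `fit_hurdle_poisson` and the toy data are hypothetical. The binary part is the share of patients with any event; the count part is a zero-truncated Poisson fitted to the positive counts only, whose maximum-likelihood rate solves lam / (1 - exp(-lam)) = mean of the positive counts.

```python
import math

def fit_hurdle_poisson(counts, tol=1e-10):
    """Intercept-only hurdle Poisson fit. Returns (p_positive, lam).

    p_positive: estimated probability of crossing the hurdle (count > 0).
    lam: rate of the zero-truncated Poisson fitted to the positive counts.
    """
    positives = [c for c in counts if c > 0]
    if not positives:
        raise ValueError("no positive counts: count part is not identified")

    # Binary component: share of observations that cross the hurdle.
    p_positive = len(positives) / len(counts)

    # Count component: solve lam / (1 - exp(-lam)) = mean of positives.
    mean_pos = sum(positives) / len(positives)
    if mean_pos <= 1.0:
        return p_positive, 0.0  # every positive equals 1: boundary solution

    def truncated_mean(lam):
        return lam / -math.expm1(-lam)

    lo, hi = 1e-12, mean_pos  # the root lies strictly between these bounds
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if truncated_mean(mid) < mean_pos:
            lo = mid
        else:
            hi = mid
    return p_positive, (lo + hi) / 2

# Toy data (hypothetical): 97 patients with no admission, 3 with some.
toy_counts = [0] * 97 + [1, 1, 3]
p_hat, lam_hat = fit_hurdle_poisson(toy_counts)
print(f"P(any admission) = {p_hat:.3f}, truncated-Poisson rate = {lam_hat:.3f}")
```

The two parts are estimated from separate pieces of the likelihood, which is why the choice only makes sense when the covariates driving "any admission" are expected to differ from those driving "how many".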

I would argue that, in this instance, the main reason for crossing that hurdle is illness severity, and that would more reasonably be estimated as a covariate in a count model, as Jeph suggested.

More importantly, hospital admissions are rare events, even in chronically ill populations. Thus, it doesn't make even intuitive sense to me for you to model these data separately, since even after the hurdle is crossed, I would expect the data to be highly skewed.

It is important for a researcher to think through these issues thoroughly and thoughtfully, and not immediately go to the "fancy statistics". Ultimately, I expect that you'll have to defend your choice of analysis, and that should rely on content knowledge as well as sufficient knowledge about statistical approaches.

Ariel
 

Date: Fri, 13 Sep 2013 10:04:16 +0100
From: Neil Hewitt <[email protected]>
Subject: RE: st: Hurdle model

Thanks Jeph,

I have used a hurdle model not a zero inflated model as I believe all my zeros are real and not structural. All individuals in my panel have the same chronic condition, the outcome is related to that chronic condition and there is little/no chance of them having an event without it being recorded.

That is how I understand the difference between the two, though I do not profess to be an expert.
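The distinction drawn above can be written out as two probability functions. This is a minimal Python sketch with illustrative parameter values that are not estimated from any data; the function names are hypothetical. In the zero-inflated Poisson, zeros come from two sources (structural zeros plus ordinary Poisson zeros); in the hurdle Poisson, a single binary process decides zero versus positive, and positive counts follow a zero-truncated Poisson.

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing k events."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def zip_pmf(k, pi_structural, lam):
    """Zero-inflated Poisson: structural zeros mixed with Poisson counts."""
    base = (1 - pi_structural) * poisson_pmf(k, lam)
    return pi_structural + base if k == 0 else base

def hurdle_pmf(k, p_zero, lam):
    """Hurdle Poisson: all zeros come from one binary process;
    positives follow a zero-truncated Poisson."""
    if k == 0:
        return p_zero
    return (1 - p_zero) * poisson_pmf(k, lam) / (1 - math.exp(-lam))

# Illustrative parameters only (assumptions, not estimates).
pi_structural, p_zero, lam = 0.90, 0.97, 0.30

for k in range(4):
    print(k, round(zip_pmf(k, pi_structural, lam), 5),
          round(hurdle_pmf(k, p_zero, lam), 5))
```

Under the zero-inflated form the zero probability always exceeds `pi_structural`, because the Poisson part contributes zeros too; under the hurdle form it is exactly `p_zero`.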

Thanks,

Neil 

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Jeph Herrin
Sent: 13 September 2013 09:51
To: [email protected]
Subject: Re: st: Hurdle model

It sounds like what you have is a "zero inflated Poisson" model, not a hurdle model. Either way, I don't think there is a panel data version available (official or user written) for either in Stata. Almost certainly not for the hurdle model: I needed to estimate some hurdle models for panel data recently and dug around a lot before finally/reluctantly running them in SAS.

Jeph


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/





