Hurdle models

Highlights

Linear hurdle model
Exponential hurdle model
Conditional heteroskedastic models
One-bound models
Two-bound models

Hurdle models concern bounded outcomes. For instance, how much someone spends at the movies is bounded by zero. In this sense, hurdle models are much like tobit models. They differ in that hurdle models provide separate equations for the bounded and the unbounded outcomes, whereas tobit models use the same equation for both. Hurdle models assume the unbounded outcomes are the result of clearing a hurdle. When the hurdle is not cleared, bounded outcomes result.

Hurdle models come in two- and three-equation forms.

The two-equation form handles lower or upper bounding. The first equation determines whether you clear the hurdle, and the second determines the value of the outcome conditional on having cleared the hurdle.
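To make the two-equation form concrete, here is a minimal Python sketch of one observation's log-likelihood contribution. It assumes a probit hurdle, a normal outcome truncated from below at the bound (the Cragg formulation), and uncorrelated errors between the two equations. The function and argument names are illustrative and are not part of Stata.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_logpdf(x):
    """Log of the standard normal density."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def hurdle_loglik_obs(y, xb, zg, sigma, ll=0.0):
    """Log-likelihood contribution of one observation (illustrative names).

    y     : observed outcome (equals ll when the hurdle is not cleared)
    xb    : linear index of the outcome equation
    zg    : linear index of the probit hurdle equation
    sigma : standard deviation of the outcome error
    ll    : lower bound
    """
    p_clear = norm_cdf(zg)  # probability of clearing the hurdle
    if y <= ll:
        # Hurdle not cleared: only the first equation contributes.
        return math.log(1.0 - p_clear)
    # Hurdle cleared: probit part plus the log density of a normal
    # truncated from below at ll.
    log_trunc = (norm_logpdf((y - xb) / sigma) - math.log(sigma)
                 - math.log(norm_cdf((xb - ll) / sigma)))
    return math.log(p_clear) + log_trunc

# One bounded and one unbounded observation with made-up index values.
print(hurdle_loglik_obs(0.0, xb=5.0, zg=0.75, sigma=1.0))
print(hurdle_loglik_obs(6.2, xb=5.0, zg=0.75, sigma=1.0))
```

Because the errors are assumed uncorrelated, the contribution separates into a hurdle term and an outcome term, which is why each equation can have its own covariates.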

The three-equation form handles lower and upper bounding. It adds another equation for clearing the second hurdle, and the middle equation is reinterpreted as determining the value conditional on having cleared both hurdles.

As an example, consider movie attendance or exercise. One equation determines whether you go to the movies (gym). Another equation determines how much you spend at the movies (how much you exercise).

As an example of lower and upper bounding, consider the Chilean health system, which categorizes people by age group and requires the purchase of health insurance. You can purchase the minimum, the maximum, or any amount in between.

Hurdle models assume that the residuals of the hurdle equation(s) and the outcome equation are uncorrelated. For this assumption to be plausible, you typically must assume that it is different people who align themselves among the possible alternatives. In the Chilean health system, for instance, individuals who buy the minimum (maximum) are different from those who purchase an interior policy.

Hurdle models are especially popular in health applications where the different-person analogy is reasonable.

Let's see it work

We wish to model movie attendance. People first decide whether they will go to the movies at all; some people simply have no interest. Those who do have an interest then decide how much to spend per month on movies.

We will model attendance using the number of hours worked, an indicator for working weekends, and whether there is a newborn in the household; the command below also includes the variable distance in this equation.

We will model the amount spent per month using whether the person is a teenager, whether they are in a romantic relationship, and the number of children aged 6–10 (old enough to go to the movies but only with supervision).

We will model the hurdle as a probit and the amount spent as a linear regression, with the lower bound set to zero by ll(0).

. churdle linear money dating teenager nkids, select(newborn hours distance weekends) ll(0)

Iteration 0:   Log likelihood = -13836.472  
Iteration 1:   Log likelihood = -13836.472  

Cragg hurdle regression                                Number of obs =  10,000
                                                       LR chi2(3)    = 8775.37
                                                       Prob > chi2   =  0.0000
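Log likelihood = -13836.472                            Pseudo R2     =  0.2408

------------------------------------------------------------------------------
       money | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
money        |
      dating |   5.024497   .0867425    57.92   0.000     4.854485    5.194509
    teenager |   1.018596   .0500987    20.33   0.000     .9204039    1.116787
       nkids |   4.968166   .0433092   114.71   0.000     4.883281     5.05305
       _cons |   4.993553   .0152177   328.14   0.000     4.963727    5.023379
-------------+----------------------------------------------------------------
selection_ll |
     newborn |  -.2088256   .0421982    -4.95   0.000    -.2915326   -.1261187
       hours |  -.0546953    .006559    -8.34   0.000    -.0675506   -.0418399
    distance |  -.2439437   .0079499   -30.68   0.000    -.2595253   -.2283621
    weekends |  -.4658996   .0812674    -5.73   0.000    -.6251807   -.3066185
       _cons |   .7544356   .0329405    22.90   0.000     .6898735    .8189977
-------------+----------------------------------------------------------------
lnsigma      |
       _cons |   .0020465   .0097069     0.21   0.833    -.0169786    .0210716
-------------+----------------------------------------------------------------
      /sigma |   1.002049   .0097267                      .9831647    1.021295
------------------------------------------------------------------------------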

We find that people are less likely to go to the movies if there is a newborn in the household, if they work more hours, if they work weekends, and as distance increases; all four selection coefficients are negative.

People who go to the movies spend more if they are dating, if they are teenagers, and the more children aged 6–10 they have.
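The reported coefficients can be turned into probabilities and expected spending with a little arithmetic. The Python sketch below is illustrative and is not Stata output. It assumes the probit hurdle and truncated-normal outcome described earlier, copies the point estimates from the table, and evaluates them for a hypothetical person whose covariates are all zero. The helper names are made up for this example.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# Point estimates copied from the churdle output.
SELECT = {"newborn": -0.2088256, "hours": -0.0546953,
          "distance": -0.2439437, "weekends": -0.4658996,
          "_cons": 0.7544356}
OUTCOME = {"dating": 5.024497, "teenager": 1.018596,
           "nkids": 4.968166, "_cons": 4.993553}
SIGMA = math.exp(0.0020465)  # /sigma is exp() of the lnsigma constant

def index(coefs, values):
    """Linear index: constant plus coefficient times covariate value."""
    return coefs["_cons"] + sum(coefs[k] * v for k, v in values.items())

def prob_attend(values):
    """Probit probability of clearing the hurdle."""
    return norm_cdf(index(SELECT, values))

def mean_spend_if_attend(values, ll=0.0):
    """Mean of a normal truncated from below at ll."""
    xb = index(OUTCOME, values)
    a = (xb - ll) / SIGMA
    return xb + SIGMA * norm_pdf(a) / norm_cdf(a)

# Hypothetical profiles (assumed values, not taken from the data).
base = {"newborn": 0, "hours": 0, "distance": 0, "weekends": 0}
with_newborn = dict(base, newborn=1)
spender = {"dating": 0, "teenager": 0, "nkids": 0}

print("P(attend), baseline:    ", prob_attend(base))
print("P(attend), with newborn:", prob_attend(with_newborn))
print("E(spend | attend):      ", mean_spend_if_attend(spender))
print("E(spend), unconditional:",
      prob_attend(base) * mean_spend_if_attend(spender))
```

The unconditional expectation multiplies the two pieces, which relies on the assumption that the hurdle and outcome errors are uncorrelated.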

Tell me more

Read more about hurdle models in the Stata Base Reference Manual; see [R] churdle.


