
## Hurdle models

### Highlights

• Linear hurdle model
• Exponential hurdle model
• Conditional heteroskedastic models
• One-bound models
• Two-bound models

Hurdle models concern bounded outcomes. For instance, how much someone spends at the movies is bounded by zero. In this sense, hurdle models are much like tobit models. They differ in that hurdle models provide separate equations for the bounded and the unbounded outcomes, whereas tobit models use the same equation for both. Hurdle models assume the unbounded outcomes are the result of clearing a hurdle. When the hurdle is not cleared, bounded outcomes result.

Hurdle models come in two- and three-equation forms.

The two-equation form handles lower or upper bounding. The first equation determines whether you clear the hurdle, and the second determines the value of the outcome conditional on having cleared the hurdle.
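In symbols, the two-equation form with a lower bound $\ell$ can be sketched as follows. This is the standard Cragg-type linear specification; the notation ($z_i$, $x_i$, $\gamma$, $\beta$, $\sigma$) is introduced here for illustration and does not appear in the original text.

$$
s_i = \mathbf{1}\{ z_i\gamma + \varepsilon_i > 0 \}, \qquad \varepsilon_i \sim N(0,1)
$$

$$
y_i =
\begin{cases}
\ell & \text{if } s_i = 0 \\
x_i\beta + \nu_i & \text{if } s_i = 1
\end{cases}
$$

Here $\nu_i$ is normal with standard deviation $\sigma$, truncated so that $x_i\beta + \nu_i > \ell$, and it is independent of $\varepsilon_i$. Under these assumptions the log likelihood separates into a probit part and a truncated-regression part:

$$
\ln L = \sum_{y_i = \ell} \ln\{1 - \Phi(z_i\gamma)\}
+ \sum_{y_i > \ell} \left[ \ln \Phi(z_i\gamma)
+ \ln \phi\!\left(\frac{y_i - x_i\beta}{\sigma}\right)
- \ln \sigma
- \ln \Phi\!\left(\frac{x_i\beta - \ell}{\sigma}\right) \right]
$$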

The three-equation form handles lower and upper bounding. It adds another equation for clearing the second hurdle, and the middle equation is reinterpreted as determining the value conditional on having cleared both hurdles.

As an example of the two-equation form, consider movie attendance or exercise. One equation determines whether you go to the movies (gym). Another equation determines how much you spend on the movies (exercising).

As an example of the three-equation form, consider the Chilean health system, which categorizes people by age group and requires the purchase of health insurance. You can purchase the minimum, the maximum, or anything in between.
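A two-bound model of this kind might be requested as in the sketch below. The variable names (`coverage`, `income`, `famsize`, `age`) and the bounds 1 and 10 are hypothetical placeholders, not taken from any real dataset. The sketch assumes a dataset in memory in which `coverage` has observations at both bounds as well as in the interior.

```stata
* Hypothetical variables and bounds, for illustration only.
* Specifying both ll() and ul() requests a lower and an upper hurdle,
* giving the three-equation form.
churdle linear coverage income famsize, select(age income) ll(1) ul(10)
```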

Hurdle models assume that the residuals of the hurdle equation(s) and the outcome equation are uncorrelated. For this assumption to be plausible, you typically must assume that different kinds of people sort themselves into the possible alternatives. In the Chilean health system, for instance, individuals who buy the minimum (maximum) are assumed to differ from those who purchase an interior policy.

Hurdle models are especially popular in health applications where the different-person analogy is reasonable.
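A small simulation can make the uncorrelated-residuals assumption concrete. The sketch below generates data in which the hurdle error and the outcome error are drawn independently, then fits a lower-bound linear hurdle model. All variable names, coefficient values, and the seed are arbitrary choices made for illustration, and `churdle` requires Stata 14 or later.

```stata
clear
set seed 2718
set obs 5000

* Hypothetical covariates: z drives the hurdle, x drives the amount
generate double z = rnormal()
generate double x = rnormal()

* Hurdle equation (probit): cleared when 0.5 + z + e > 0, e ~ N(0,1)
generate byte cleared = (0.5 + z + rnormal()) > 0

* Outcome equation: xb + v, where v is normal with sd sd_v, truncated so
* that xb + v > 0. The draw uses the inverse-CDF method and is independent
* of the hurdle error drawn above.
scalar sd_v = 1.5
generate double xb = 2 + x
generate double p0 = normal(-xb/sd_v)      // probability mass below zero
generate double v  = sd_v*invnormal(p0 + runiform()*(1 - p0))

* Observed outcome: zero if the hurdle is not cleared
generate double y = cond(cleared, xb + v, 0)

* Fit the two-equation hurdle model with a lower bound of zero
churdle linear y x, select(z) ll(0)
```

If the two errors were instead correlated, the separate-equation likelihood would be misspecified, which is why the different-person interpretation matters.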

### Let's see it work

We wish to model movie attendance. People first decide whether they will go to the movies at all; some people simply have no interest. Those who do have an interest then decide how much to spend per month on movies.

We will model the decision to attend using the number of hours worked, an indicator for working on weekends, and an indicator for having a newborn.

We will model the amount spent per month using an indicator for being a teenager, an indicator for being in a romantic relationship, and the number of children aged 6–10 (old enough to go to the movies but only with supervision).

We will model the hurdle as probit and the amount spent as a linear regression.

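    . churdle linear money dating teenager nkids, select(newborn hours weekends) ll(0)

    Cragg hurdle regression                         Number of obs     =     10,000
                                                    LR chi2(3)        =    8775.37
                                                    Prob > chi2       =     0.0000
    Log likelihood = -20230.563                     Pseudo R2         =     0.2408

    ------------------------------------------------------------------------------
           money |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    money        |
          dating |   15.07349   .2602275    57.92   0.000     14.56345    15.58353
        teenager |   3.055787   .1502961    20.33   0.000     2.761212    3.350362
           nkids |    14.9045   .1299277   114.71   0.000     14.64984    15.15915
           _cons |   14.98066    .045653   328.14   0.000     14.89118    15.07014
    -------------+----------------------------------------------------------------
    selection_ll |
         newborn |  -.1832054   .0408579    -4.48   0.000    -.2632854   -.1031254
           hours |  -.0476496   .0063111    -7.55   0.000     -.060019   -.0352802
        weekends |  -.4235522   .0788783    -5.37   0.000    -.5781509   -.2689536
           _cons |   .2977912   .0285355    10.44   0.000     .2418626    .3537199
    -------------+----------------------------------------------------------------
    lnsigma      |
           _cons |   1.100659   .0097069   113.39   0.000     1.081634    1.119684
    -------------+----------------------------------------------------------------
          /sigma |   3.006146   .0291802                      2.949494    3.063885
    ------------------------------------------------------------------------------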


We find that people are less likely to go to the movies if there is a newborn in the household, if they work more hours, or if their work involves weekends.

People who go to the movies tend to spend more if they are dating, if they are teenagers, and the more children aged 6–10 they have.
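To read a little more out of these estimates, the following sketch can be run right after the `churdle` fit. It assumes that fit is still the active estimation result and that the equation names match the labels in the output (`selection_ll`, `lnsigma`). The choice of `hours` for the marginal effect is only an illustration.

```stata
* Assumes the churdle fit on money is the active estimation result.

* sigma is estimated on the log scale; exponentiating the lnsigma
* constant recovers the /sigma value reported in the table
display exp(_b[lnsigma:_cons])

* Probit probability of clearing the hurdle when newborn, hours, and
* weekends are all zero (an illustrative baseline, not a typical person)
display normal(_b[selection_ll:_cons])

* Average marginal effect of hours worked on the default prediction
margins, dydx(hours)
```

Because `hours` enters only the hurdle equation, any effect it has on expected spending works through the probability of going to the movies at all.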

### Tell me more

Read more about hurdle models in the Stata Base Reference Manual; see [R] churdle.