Instrumental-variables fractional probit model

Highlights

Fractional outcomes
One or more continuous endogenous covariates
Estimated covariance of endogenous error

Fractional outcomes are common. You might be modeling participation rates in a 401(k) pension plan, the pass rate on standardized tests, expenditure shares, or the like.

Fractional response models are a flexible and intuitive way to model outcomes that lie between 0 and 1. Unlike linear models, they cannot yield predictions outside the unit interval, and unlike log-odds models, they remain defined when the outcome is exactly 0 or 1. Fractional response models can be fit using the fracreg command.

What if you are concerned that one or more of your model covariates are endogenous? With the new ivfprobit command, you can fit a model for a fractional dependent variable and account for endogeneity in one or more of the covariates.
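
In outline, the exogenous covariates follow the dependent variable, and the endogenous covariates and their instruments go inside parentheses, separated by an equal sign. The line below is only a sketch of that layout with hypothetical variable names (y, x1, x2, w1, w2, z1, z2, z3); it assumes two endogenous covariates, w1 and w2, and three instruments, so that there are at least as many instruments as endogenous covariates.

. ivfprobit y x1 x2 (w1 w2 = z1 z2 z3)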

Let's see it work

We want to study the 401(k) participation rate (prate). We believe that corporate employment size (ltotemp) and its square are determinants of participation rates, as are an indicator of whether the 401(k) is the sole pension plan (sole) and the plan matching rate (mrate). We believe, however, that the plan matching rate is endogenous; that is, there are unobserved determinants of participation rates that also affect the plan matching rate. For instance, both rates might be associated with industry and regional practices that are not observable in the data. To address endogeneity, we instrument the matching rate using the age of the plan (age) and its square.

We type

. ivfprobit prate c.ltotemp##c.ltotemp i.sole (mrate = c.age##c.age)

Inside the parentheses is the endogenous variable along with the instrumental variables we use to model it. Outside the parentheses are the exogenous variables that affect prate directly. We get

. ivfprobit prate c.ltotemp##c.ltotemp i.sole (mrate = c.age##c.age)

Fitting exogenous fractional probit model:
Iteration 0:  Log pseudolikelihood = -1769.7046
Iteration 1:  Log pseudolikelihood = -1675.4223
Iteration 2:  Log pseudolikelihood = -1674.7663
Iteration 3:  Log pseudolikelihood = -1674.7661
Iteration 4:  Log pseudolikelihood = -1674.7661

Fitting full model:
Iteration 0:  Log pseudolikelihood =  -3712.498
Iteration 1:  Log pseudolikelihood = -3712.4767
Iteration 2:  Log pseudolikelihood = -3712.4767

Fractional probit model with endogenous regressors

                                                        Number of obs =  4,075
                                                        Wald chi2(4)  = 907.06
Log pseudolikelihood = -3712.4767                       Prob > chi2   = 0.0000


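------------------------------------------------------------------------------
             |               Robust
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       mrate |   1.907922   .0946094    20.17   0.000     1.722491    2.093353
     ltotemp |  -.4229273   .0744177    -5.68   0.000    -.5687833   -.2770713
             |
   c.ltotemp#|
   c.ltotemp |   .0217492   .0046476     4.68   0.000       .01264    .0308583
             |
        sole |
  Only plan  |  -.1733119   .0366136    -4.73   0.000    -.2450733   -.1015504
       _cons |   1.904103   .3199032     5.95   0.000     1.277104    2.531102
-------------+----------------------------------------------------------------
corr(e.mrate,|
     e.prate)|  -.5690386   .0431738                     -.6476498   -.4784406
  sd(e.mrate)|   .3989664   .0061807                      .3870345    .4112661
------------------------------------------------------------------------------
Wald test of exogeneity: chi2(1) = 102.40                 Prob > chi2 = 0.0000
Endogenous: mrate
Exogenous:  ltotemp c.ltotemp#c.ltotemp 1.sole age c.age#c.age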

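We find a positive effect of the matching rate on the participation rate: the coefficient on mrate is 1.91 on the probit index scale, with a 95% confidence interval of [1.72, 2.09]. Additionally, the estimated correlation between the unobservables, corr(e.mrate, e.prate), is -0.57, and its confidence interval excludes zero. The Wald test of exogeneity rejects the null hypothesis of no correlation (chi2(1) = 102.40, p < 0.0001), so there is evidence to support our endogeneity conjecture.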
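
The coefficients reported by ivfprobit are on the probit index scale, so they do not directly give the change in the participation rate. One possible follow-up, sketched here on the assumption that margins is run immediately after the ivfprobit fit above and is supported after it as after other estimation commands, is to request the average marginal effect of the matching rate. No output is shown because none was part of the original example.

. margins, dydx(mrate)

To see what instrumenting changes, one could also fit the fractional probit model that treats mrate as exogenous and compare its coefficient on mrate with the one above. Again, this is a sketch rather than part of the original example:

. fracreg probit prate mrate c.ltotemp##c.ltotemp i.sole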

Tell me more

View all the new features in Stata 18 and, in particular, New in instrumental-variables analysis.
