One or more continuous endogenous covariates
Estimated covariance of endogenous error
Fractional outcomes are common. You might be modeling participation rates in a 401(k) pension plan, the pass rate on standardized tests, expenditure shares, or the like.
Fractional response models are a flexible and intuitive way to model outcomes that lie between 0 and 1. Unlike linear models, they cannot yield predictions outside 0 and 1, and unlike log-odds models, they remain defined when the outcome is exactly 0 or 1. Fractional response models can be fit using the fracreg command.
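As a minimal sketch of what a fracreg call looks like, the lines below fit a fractional probit model that treats every covariate as exogenous. The dataset name (webuse 401k) is an assumption on our part; the variable names are the ones used in the 401(k) example later on this page.

```stata
* Assumption: the 401(k) example data are available as Stata's 401k dataset
webuse 401k, clear

* Fractional probit for the participation rate, with all covariates
* (including the matching rate) treated as exogenous
fracreg probit prate mrate c.ltotemp##c.ltotemp i.sole
```

This baseline is useful mainly as a point of comparison: if a covariate such as mrate is in fact endogenous, its coefficient here would not be expected to match the one from a model that accounts for endogeneity.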
What if you are concerned that one or more of your model covariates are endogenous? With the new ivfprobit command, you can fit a model for a fractional dependent variable and account for endogeneity in one or more of the covariates.
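One way to picture such a model is as two equations with correlated errors. The notation below is ours, chosen for illustration and not taken from the Stata documentation: y is the fractional outcome, w the endogenous covariate, x the exogenous covariates, and z the instruments. The unit-variance normalization of u is likewise an assumption of this sketch.

```latex
\begin{aligned}
E(y_i \mid \mathbf{x}_i, w_i, u_i) &= \Phi\!\left(\mathbf{x}_i\boldsymbol{\beta} + \alpha w_i + u_i\right) \\
w_i &= \mathbf{x}_i\boldsymbol{\pi}_1 + \mathbf{z}_i\boldsymbol{\pi}_2 + v_i \\
\begin{pmatrix} u_i \\ v_i \end{pmatrix} &\sim
N\!\left(\mathbf{0},\;
\begin{pmatrix} 1 & \rho\sigma \\ \rho\sigma & \sigma^{2} \end{pmatrix}\right)
\end{aligned}
```

Under this sketch, endogeneity corresponds to a nonzero correlation ρ between u and v. When ρ = 0, w can be treated like any other exogenous covariate and an ordinary fractional probit model would suffice.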
We want to study 401(k) participation rate (prate). We believe that corporate employment size (ltotemp) and its square are determinants of participation rates, as are an indicator of whether the 401(k) is the sole pension plan (sole) and the plan matching rate (mrate). We believe, however, that the plan matching rate is endogenous. In other words, there are unobserved determinants of participation rates that also affect the plan matching rate. For instance, matching rate and participation rate might be associated with industry practices and regional practices not observable in the data. To address endogeneity, we instrument matching rate using the age of the plan (age) and its square.
. ivfprobit prate c.ltotemp##c.ltotemp i.sole (mrate = c.age##c.age)
Inside the parentheses is the endogenous variable, followed by the instrumental variables we use to model it. Outside the parentheses are the exogenous variables that affect prate directly. We get
. ivfprobit prate c.ltotemp##c.ltotemp i.sole (mrate = c.age##c.age)

Fitting exogenous fractional probit model:
Iteration 0:  Log pseudolikelihood = -1769.7046
Iteration 1:  Log pseudolikelihood = -1675.4223
Iteration 2:  Log pseudolikelihood = -1674.7663
Iteration 3:  Log pseudolikelihood = -1674.7661
Iteration 4:  Log pseudolikelihood = -1674.7661

Fitting full model:
Iteration 0:  Log pseudolikelihood =  -3712.498
Iteration 1:  Log pseudolikelihood = -3712.4767
Iteration 2:  Log pseudolikelihood = -3712.4767

Fractional probit model with endogenous regressors   Number of obs =  4,075
                                                     Wald chi2(4)  = 907.06
Log pseudolikelihood = -3712.4767                    Prob > chi2   = 0.0000

                      | Coefficient  std. err.       z   P>|z|     [95% conf. interval]
----------------------+----------------------------------------------------------------
                mrate |    1.907922   .0946094   20.17   0.000     1.722491    2.093353
              ltotemp |   -.4229273   .0744177   -5.68   0.000    -.5687833   -.2770713
  c.ltotemp#c.ltotemp |    .0217492   .0046476    4.68   0.000       .01264    .0308583
      sole: Only plan |   -.1733119   .0366136   -4.73   0.000    -.2450733   -.1015504
                _cons |    1.904103   .3199032    5.95   0.000     1.277104    2.531102
----------------------+----------------------------------------------------------------
corr(e.mrate,e.prate) |   -.5690386   .0431738                    -.6476498   -.4784406
          sd(e.mrate) |    .3989664   .0061807                     .3870345    .4112661
We find a positive and statistically significant effect of the matching rate on the participation rate (coefficient 1.91, z = 20.17). Additionally, the estimated correlation between the unobservables, corr(e.mrate, e.prate), is -0.57, and its 95% confidence interval, [-0.65, -0.48], excludes zero. This means there is evidence to support our conjecture that the matching rate is endogenous.
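The coefficient on mrate is on the probit index scale, so a natural follow-up is to express it as an effect on the participation rate itself. The sketch below is one hedged way to do that, together with an informal check of instrument relevance. It assumes that the data are Stata's 401k example dataset and that margins is available after ivfprobit with a default prediction on the participation-rate scale. The first-stage regression is our own addition for illustration and is not part of the ivfprobit output.

```stata
* Assumption: the 401(k) example data are available as Stata's 401k dataset
webuse 401k, clear

* Refit the model from the text
ivfprobit prate c.ltotemp##c.ltotemp i.sole (mrate = c.age##c.age)

* Average marginal effect of the matching rate
* (assumes margins uses a suitable default prediction after ivfprobit)
margins, dydx(mrate)

* Informal relevance check: regress the endogenous covariate on the
* instruments and exogenous covariates, then jointly test the instruments
regress mrate c.age##c.age c.ltotemp##c.ltotemp i.sole, vce(robust)
testparm c.age c.age#c.age
```

A joint test that rejects the null for age and its square suggests the instruments are correlated with the matching rate. It says nothing about whether they are validly excluded from the participation equation, which remains an assumption of the model.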