Title | Define constraints for parameters |
Author | Weihua Guan, StataCorp |
In this particular model, heckman does not estimate the parameter rho directly, but estimates a transformation:
atanh_rho = 1/2 * ln[(1+rho)/(1−rho)]
It is estimated in a constant-only equation athrho. Thus we need to constrain the constant term of equation athrho to be 0 (rho=0 implies atanh_rho=0).
. constraint define 1 _b[/athrho]=0
. heckman ..., select(...) constraint(1)
Now let’s extend the answer to more general cases: how to define constraints on parameters of a model in Stata. The syntax is generally
constraint define # [ exp=exp | coefficientlist ]
When we want to fix a parameter at a certain value, it becomes
constraint define # [equation_name]coefficient_name = #
The equation_name may not be necessary for a single-equation model such as OLS. It is easy to apply this rule to the coefficient of a covariate.
constraint define # [equation_name]covariate_name = #
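As a minimal sketch of the two forms above, the following fixes a coefficient first in a single-equation model and then in a multiple-equation model. The constraint numbers (2 and 3), the fixed values (2 and 0.2), and the use of the auto dataset with cnsreg are illustrative assumptions, not part of the original example.

```stata
* Single-equation model: no equation name is needed.
sysuse auto, clear
constraint define 2 weight = 2
cnsreg price mpg weight, constraints(2)

* Multiple-equation model: prefix the coefficient with its equation name.
use http://www.stata-press.com/data/r14/womenwk, clear
constraint define 3 [wage]age = 0.2
heckman wage educ age, select(married children educ age) constraints(3)
```

In both cases the constrained coefficient should appear in the output at its fixed value, and the constraint is listed above the coefficient table.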
One can find the equation_name easily from the output. Often it is just the name of the dependent variable.
But what about other parameters in the model, such as rho in heckman? This requires some understanding of how Stata estimates those parameters. In ML estimation, Stata always defines them in separate equations, that is, one equation per parameter. Those equations are constant-only, and the estimated constants are the estimated parameters. Often, a transformation is needed so that the estimated quantity is unrestricted. For instance, the standard deviation sigma of a normal distribution must always be greater than 0, so a log transformation is used, which lets the estimated quantity, ln(sigma), range from −infinity to +infinity. Check the Methods and formulas section of the estimation command's manual entry to find out whether a transformation is applied.
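To illustrate constraining a transformed parameter, the sketch below fixes sigma at 5 in a heckman model by constraining ln(sigma). It assumes that heckman estimates ln(sigma) in a constant-only equation named lnsigma; the value 5, the constraint number 4, and the local macro name lnsig are arbitrary choices for illustration. On older Stata versions the coefficient may need to be written as [lnsigma]_cons instead of _b[/lnsigma].

```stata
use http://www.stata-press.com/data/r14/womenwk, clear

* The constraint must be stated on the estimation scale, so transform first.
local lnsig = ln(5)
constraint define 4 _b[/lnsigma] = `lnsig'

heckman wage educ age, select(married children educ age) constraints(4)
```

The key design point is that the constraint is written in terms of the quantity Stata actually estimates, not the parameter as it appears in the model.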
Now let’s go back to the question in heckman. As described in the short answer, heckman does use a transformation to estimate rho.
atanh_rho = 1/2 * ln[(1+rho)/(1−rho)] (p.976 of [R] heckman)
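Written out, the transformation and its inverse are as follows. These are the standard identities for the inverse hyperbolic tangent, stated here for reference; the symbol a is shorthand introduced here for atanh_rho.

```latex
a = \operatorname{atanh}(\rho) = \frac{1}{2}\ln\!\left(\frac{1+\rho}{1-\rho}\right),
\qquad
\rho = \tanh(a) = \frac{e^{2a}-1}{e^{2a}+1},
\qquad -1 < \rho < 1 .
```

Setting rho = 0 gives a = 1/2 * ln(1) = 0, and as rho approaches −1 or +1, a goes to −infinity or +infinity, so the estimated quantity is unrestricted.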
Using the example in the manual
. use http://www.stata-press.com/data/r14/womenwk, clear

. heckman wage educ age, select(married children educ age)

Iteration 0:  Log likelihood = -5178.7009
Iteration 1:  Log likelihood = -5178.3049
Iteration 2:  Log likelihood = -5178.3045

Heckman selection model                         Number of obs     =      2,000
(regression model with sample selection)              Selected    =      1,343
                                                      Nonselected =        657

                                                Wald chi2(2)      =     508.44
Log likelihood = -5178.304                      Prob > chi2       =     0.0000

------------------------------------------------------------------------------
        wage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
wage         |
   education |   .9899537   .0532565    18.59   0.000     .8855729    1.094334
         age |   .2131294   .0206031    10.34   0.000     .1727481    .2535108
       _cons |   .4857752   1.077037     0.45   0.652    -1.625179     2.59673
-------------+----------------------------------------------------------------
select       |
     married |   .4451721   .0673954     6.61   0.000     .3130794    .5772647
    children |   .4387068   .0277828    15.79   0.000     .3842534    .4931601
   education |   .0557318   .0107349     5.19   0.000     .0346917    .0767718
         age |   .0365098   .0041533     8.79   0.000     .0283694    .0446502
       _cons |  -2.491015   .1893402   -13.16   0.000    -2.862115   -2.119915
-------------+----------------------------------------------------------------
     /athrho |   .8742086   .1014225     8.62   0.000     .6754241    1.072993
    /lnsigma |   1.792559    .027598    64.95   0.000     1.738468     1.84665
-------------+----------------------------------------------------------------
         rho |   .7035061   .0512264                      .5885365    .7905862
       sigma |   6.004797   .1657202                       5.68862    6.338548
      lambda |   4.224412   .3992265                      3.441942    5.006881
------------------------------------------------------------------------------
LR test of indep. eqns. (rho = 0): chi2(1) = 61.20        Prob > chi2 = 0.0000
The output contains four equations: wage and select for the covariates, athrho for atanh_rho, and lnsigma for ln(sigma). The constant-only equation for a parameter is often displayed as /equation_name in the output table. The last rows of the table display the estimated values of rho, sigma, and lambda, which are transformed back from the estimation results.
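The back-transformation can be reproduced by hand. The sketch below assumes that the unconstrained heckman model is the active estimation result and that the ancillary parameters can be referenced as _b[/athrho] and _b[/lnsigma]; on older Stata versions they may instead be [athrho]_cons and [lnsigma]_cons. The labels rho, sigma, and lambda are names chosen here for the nlcom output.

```stata
* rho = tanh(atanh_rho), sigma = exp(ln(sigma)), lambda = rho*sigma
nlcom (rho:    tanh(_b[/athrho]))                      ///
      (sigma:  exp(_b[/lnsigma]))                      ///
      (lambda: tanh(_b[/athrho]) * exp(_b[/lnsigma]))
```

The point estimates should agree with the rho, sigma, and lambda values reported by heckman. The confidence intervals may differ somewhat, because nlcom applies the delta method on the final scale.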
Now we can impose the constraint on rho, which is actually on the constant term of equation athrho.
. use http://www.stata-press.com/data/r14/womenwk, clear

. local athrho=1/2*ln((1+0)/(1-0))

. constraint define 1 _b[/athrho]=0

. heckman wage educ age, select(married children educ age) constraint(1)

Iteration 0:   log likelihood = -5283.1781
Iteration 1:   log likelihood = -5230.2173
Iteration 2:   log likelihood = -5208.9358
Iteration 3:   log likelihood = -5208.9038
Iteration 4:   log likelihood = -5208.9038

Heckman selection model                         Number of obs      =      2000
(regression model with sample selection)        Censored obs       =       657
                                                Uncensored obs     =      1343

                                                Wald chi2(2)       =    456.00
Log likelihood = -5208.904                      Prob > chi2        =    0.0000

 ( 1)  [athrho]_cons = 0
------------------------------------------------------------------------------
        wage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
wage         |
   education |   .8965829   .0497504    18.02   0.000     .7990738     .994092
         age |   .1465739   .0186926     7.84   0.000     .1099371    .1832106
       _cons |   6.084875   .8886241     6.85   0.000     4.343204    7.826546
-------------+----------------------------------------------------------------
select       |
     married |   .4308575    .074208     5.81   0.000     .2854125    .5763025
    children |   .4473249   .0287417    15.56   0.000     .3909922    .5036576
   education |   .0583645   .0109742     5.32   0.000     .0368555    .0798735
         age |   .0347211   .0042293     8.21   0.000     .0264318    .0430105
       _cons |  -2.467365   .1925635   -12.81   0.000    -2.844782   -2.089948
-------------+----------------------------------------------------------------
     /athrho |          0  (constrained)
    /lnsigma |   1.694868   .0192951    87.84   0.000      1.65705    1.732686
-------------+----------------------------------------------------------------
         rho |          0  (omitted)
       sigma |   5.445927   .1050797                      5.243821    5.655824
      lambda |          0  (omitted)
------------------------------------------------------------------------------
The output shows that the constraint is applied correctly: /athrho is reported as 0 (constrained), and rho and lambda are therefore 0 as well.
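As a further check, the likelihood-ratio test of rho = 0 reported at the bottom of the unconstrained output can be recovered from the two log likelihoods. This is a sketch rather than part of the original answer: the scalar names ll_u and ll_c are hypothetical, and e(ll) is the log likelihood stored by the estimation command.

```stata
use http://www.stata-press.com/data/r14/womenwk, clear

* Unconstrained fit
heckman wage educ age, select(married children educ age)
scalar ll_u = e(ll)

* Constrained fit (rho = 0)
constraint define 1 _b[/athrho] = 0
heckman wage educ age, select(married children educ age) constraints(1)
scalar ll_c = e(ll)

* LR statistic and its p-value with 1 degree of freedom
display 2*(ll_u - ll_c)
display chi2tail(1, 2*(ll_u - ll_c))
```

With the log likelihoods shown in the two outputs, 2 * (−5178.3045 − (−5208.9038)) is approximately 61.20, which matches the chi2(1) value in the LR test line.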