- Simply prefix your sample-selection command with **bayes:**
- Linear, binary, and ordinal outcomes
- Default and custom prior distributions
- Full Bayesian-features support

Sample selection arises when the sampled data are not representative of the population of interest. A classic example of sample selection is women's work participation. Suppose that we want to model the wages of women. If we consider only the sample of women who chose to work, we may end up with a sample in which the wages are too high because women who would have low wages may have chosen not to work. Of course, if the decision whether to work is random, there would be no problem with using only the sample of women who work. This is not a realistic assumption in this case. To obtain valid inference in this example, we must model the outcome, the wages, and the decision to work. We will refer to the two models as the outcome model and the participation model.
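In symbols, one standard way to write the two models is sketched below. The notation ($y_j$, $x_j$, $z_j$, $\beta$, $\gamma$, $u_{1j}$, $u_{2j}$) is introduced here for illustration and does not appear in the output that follows. The parameters $\rho$ and $\sigma$ correspond to **rho** and **sigma** discussed later.

$$
\begin{aligned}
&\text{Outcome model:} && y_j = x_j\beta + u_{1j} \\
&\text{Participation model:} && y_j \text{ is observed if } z_j\gamma + u_{2j} > 0 \\
&\text{Errors:} && u_{1j} \sim N(0,\sigma^2),\quad u_{2j} \sim N(0,1),\quad \operatorname{corr}(u_{1j},u_{2j}) = \rho
\end{aligned}
$$

When $\rho = 0$, the two error terms are uncorrelated and the outcome model can be fit on the selected sample alone.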

In Stata, you can use heckman to fit a Heckman selection model to
continuous outcomes, heckprobit to fit a probit sample-selection model
to binary outcomes, and heckoprobit to fit an ordered probit model with
sample selection to ordinal outcomes. You can now simply prefix these commands
with **bayes:**
to fit the corresponding Bayesian sample-selection models.
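As a rough sketch of the prefix pattern for binary and ordinal outcomes, the commands below assume the example datasets that we understand the **heckprobit** and **heckoprobit** manual entries to use (**school** and **womensat**). The dataset names, the variable names, and the seed **rseed(17)** are assumptions for illustration and are not part of the wage example in this article.

```stata
* Binary outcome with sample selection (assumed dataset: school).
webuse school, clear
bayes, rseed(17): heckprobit private years logptax, select(vote = years loginc logptax)

* Ordinal outcome with sample selection (assumed dataset: womensat).
webuse womensat, clear
bayes, rseed(17): heckoprobit satisfaction education age, select(work = education age i.married##c.children)
```

In both cases the only change from the classical command is the **bayes:** prefix. Default priors are used unless others are specified.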

Continuing with our example of women's work participation, we first fit the classical Heckman sample-selection model. Below we model both the wages and the decision to work based on the level of education and age. For the decision to work, we additionally include marriage status and number of children.

. heckman wage educ age, select(married children educ age)

Heckman selection model (regression model with sample selection)

| | |
|---|---|
| Number of obs | 2,000 |
| Selected | 1,343 |
| Nonselected | 657 |
| Wald chi2(2) | 508.44 |
| Prob > chi2 | 0.0000 |
| Log likelihood | -5178.304 |

| wage | Coef. | Std. Err. | z | P>\|z\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| **wage** | | | | | | |
| education | .9899537 | .0532565 | 18.59 | 0.000 | .8855729 | 1.094334 |
| age | .2131294 | .0206031 | 10.34 | 0.000 | .1727481 | .2535108 |
| _cons | .4857752 | 1.077037 | 0.45 | 0.652 | -1.625179 | 2.59673 |
| **select** | | | | | | |
| married | .4451721 | .0673954 | 6.61 | 0.000 | .3130794 | .5772647 |
| children | .4387068 | .0277828 | 15.79 | 0.000 | .3842534 | .4931601 |
| education | .0557318 | .0107349 | 5.19 | 0.000 | .0346917 | .0767718 |
| age | .0365098 | .0041533 | 8.79 | 0.000 | .0283694 | .0446502 |
| _cons | -2.491015 | .1893402 | -13.16 | 0.000 | -2.862115 | -2.119915 |
| /athrho | .8742086 | .1014225 | 8.62 | 0.000 | .6754241 | 1.072993 |
| /lnsigma | 1.792559 | .027598 | 64.95 | 0.000 | 1.738468 | 1.84665 |
| rho | .7035061 | .0512264 | | | .5885365 | .7905862 |
| sigma | 6.004797 | .1657202 | | | 5.68862 | 6.338548 |
| lambda | 4.224412 | .3992265 | | | 3.441942 | 5.006881 |

To fit its Bayesian analog, we use **bayes: heckman**.

. bayes: heckman wage educ age, select(married children educ age)

Model summary

Likelihood:

wage ~ heckman(xb_wage,xb_select,{athrho} {lnsigma})

Priors:

{wage:education age _cons} ~ normal(0,10000) (1)

{select:married children education age _cons} ~ normal(0,10000) (2)

{athrho lnsigma} ~ normal(0,10000)

(1) Parameters are elements of the linear form xb_wage.

(2) Parameters are elements of the linear form xb_select.

| | Mean | Std. Dev. | MCSE | Median | Equal-tailed [95% Cred. | Interval] |
|---|---|---|---|---|---|---|
| **wage** | | | | | | |
| education | .9919131 | .051865 | .002609 | .9931531 | .8884407 | 1.090137 |
| age | .2131372 | .0209631 | .001071 | .2132548 | .1720535 | .2550835 |
| _cons | .4696264 | 1.089225 | .0716 | .4406188 | -1.612032 | 2.65116 |
| **select** | | | | | | |
| married | .4461775 | .0681721 | .003045 | .4456493 | .3178532 | .5785857 |
| children | .4401305 | .0255465 | .001156 | .4402145 | .3911135 | .4903804 |
| education | .0559983 | .0104231 | .000484 | .0556755 | .0360289 | .076662 |
| age | .0364752 | .0042497 | .000248 | .0362858 | .0280584 | .0449843 |
| _cons | -2.494424 | .18976 | .011327 | -2.498414 | -2.861266 | -2.114334 |
| athrho | .868392 | .099374 | .005961 | .8699977 | .6785641 | 1.062718 |
| lnsigma | 1.793428 | .0269513 | .001457 | 1.793226 | 1.740569 | 1.846779 |

Unlike **heckman**, **bayes: heckman** reports the ancillary parameters
only in the estimation metric. We can use **bayesstats summary** to obtain
the parameters in the original metric.

. bayesstats summary (rho:1-2/(exp(2*{athrho})+1)) (sigma:exp({lnsigma}))

Posterior summary statistics

MCMC sample size = 10,000

rho : 1-2/(exp(2*{athrho})+1)

sigma : exp({lnsigma})

| | Mean | Std. Dev. | MCSE | Median | Equal-tailed [95% Cred. | Interval] |
|---|---|---|---|---|---|---|
| rho | .6970522 | .0510145 | .003071 | .701373 | .5905851 | .7867018 |
| sigma | 6.012205 | .1621422 | .008761 | 6.008807 | 5.700587 | 6.339366 |
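The back-transformations can be checked by hand. The expression 1-2/(exp(2*a)+1) is algebraically the hyperbolic tangent of a, and **lambda** in the classical output is the product of **rho** and **sigma**. The short sketch below plugs in the classical point estimates of /athrho and /lnsigma reported by **heckman**. The scalar names are made up for illustration.

```stata
* Classical estimates of /athrho and /lnsigma, copied from the heckman output.
scalar athrho_hat  = .8742086
scalar lnsigma_hat = 1.792559

* The bayesstats summary expression and tanh() give the same value of rho.
display "rho (expression) = " 1 - 2/(exp(2*athrho_hat) + 1)
display "rho (tanh)       = " tanh(athrho_hat)

* sigma is the exponential of lnsigma; lambda is rho times sigma.
display "sigma            = " exp(lnsigma_hat)
display "lambda           = " tanh(athrho_hat)*exp(lnsigma_hat)
```

The displayed values should agree, up to rounding, with the **rho**, **sigma**, and **lambda** rows of the classical output.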

Parameter **rho** is a correlation coefficient that measures the dependence
between the outcome and participation models. If **rho** is zero, the two
models are independent and can be analyzed separately. In other words, there
is no sample selection, and we can model the wages using only the sample of
women who work without introducing any bias in our results. In our example,
**rho** is estimated to be between 0.59 and 0.79 with a probability of
0.95, so the decision to work is related to the wages in this example.
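A posterior probability statement about **rho** can also be obtained directly with **bayestest interval**. The sketch below refits the model so that it runs on its own. It assumes that the data are the **womenwk** example dataset available through **webuse**, and it uses an arbitrary seed, **rseed(17)**, so its results will not match the output above exactly.

```stata
* Assumed setup: load the example data and refit the Bayesian model.
webuse womenwk, clear
bayes, rseed(17): heckman wage educ age, select(married children educ age)

* Posterior probability that rho is positive.
bayestest interval (rho: 1-2/(exp(2*{athrho})+1)), lower(0)

* Posterior probability that rho exceeds 0.5.
bayestest interval (rho: 1-2/(exp(2*{athrho})+1)), lower(0.5)
```

The thresholds 0 and 0.5 are only examples. Any bounds of substantive interest can be supplied through **lower()** and **upper()**.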

We can test for sample selection formally by using, for example, Bayes
factors. A Bayes factor of two models is simply the ratio of their marginal
likelihoods. The larger the value of the marginal likelihood, the better the
model fits the data. To test for sample selection, we can compare the marginal
likelihoods of the current model and of the model with **rho** equal to
zero.
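Written out, with $M_1$ and $M_2$ denoting the two models and $m(y \mid M)$ the marginal likelihood of the data under model $M$ (notation assumed here for illustration), the Bayes factor and its logarithm are

$$
\mathrm{BF}_{12} = \frac{m(y \mid M_1)}{m(y \mid M_2)}, \qquad \log \mathrm{BF}_{12} = \log m(y \mid M_1) - \log m(y \mid M_2).
$$

A positive log Bayes factor favors $M_1$, and a negative one favors $M_2$.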

First, we store the current Bayesian estimation results from the sample-selection model.

. bayes, saving(heckman_mcmc)

. estimates store heckman

Next, we fit a model that assumes no sample selection. When **rho** equals
zero, **{athrho}** also equals zero. So we specify a strong prior
concentrated at zero for parameter **{athrho}**: a normal prior with mean 0
and variance 1e-4 (a standard deviation of 0.01).

. bayes, prior({athrho}, normal(0,1e-4)) saving(nosel_mcmc): heckman wage educ age, select(married children educ age)

Model summary

Likelihood:

wage ~ heckman(xb_wage,xb_select,{athrho} {lnsigma})

Priors:

{wage:education age _cons} ~ normal(0,10000) (1)

{select:married children education age _cons} ~ normal(0,10000) (2)

{athrho} ~ normal(0,1e-4)

{lnsigma} ~ normal(0,10000)

(1) Parameters are elements of the linear form xb_wage.

(2) Parameters are elements of the linear form xb_select.

| | Mean | Std. Dev. | MCSE | Median | Equal-tailed [95% Cred. | Interval] |
|---|---|---|---|---|---|---|
| **wage** | | | | | | |
| education | .8981219 | .0509913 | .001578 | .8973616 | .8013416 | 1.000497 |
| age | .1477784 | .01854 | .00066 | .1477496 | .1115628 | .1850257 |
| _cons | 5.994764 | .890318 | .030657 | 6.014622 | 4.150738 | 7.658942 |
| **select** | | | | | | |
| married | .4351031 | .0748102 | .003577 | .4377313 | .2821176 | .5752786 |
| children | .4501657 | .0285028 | .001045 | .4492015 | .3937091 | .5048498 |
| education | .0584037 | .0110582 | .000524 | .0579573 | .0370387 | .0814287 |
| age | .034779 | .0043677 | .00022 | .0348894 | .0259916 | .043139 |
| _cons | -2.47607 | .1962162 | .009818 | -2.467739 | -2.862694 | -2.10733 |
| athrho | .0062804 | .010209 | .00023 | .0062746 | -.014139 | .0261746 |
| lnsigma | 1.69586 | .019056 | .000386 | 1.695649 | 1.65948 | 1.734115 |

After storing these results under the name **nosel** (with **estimates store nosel**), we use **bayesstats ic** to obtain the Bayes factor of the two models.

. bayesstats ic heckman nosel

Bayesian information criteria

| | DIC | log(ML) | log(BF) |
|---|---|---|---|
| heckman | 10376.05 | -5260.202 | . |
| nosel | 10435.29 | -5283.025 | -22.82221 |

The log Bayes factor of about -23, computed for **nosel** relative to
**heckman**, indicates a very strong preference for the sample-selection
model **heckman** and thus for the presence of sample selection in these data.
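The reported log(BF) is simply the difference of the two log(ML) values, which can be verified with a few lines of arithmetic. The sketch below copies the log marginal likelihoods from the **bayesstats ic** table. The scalar names are hypothetical.

```stata
* log(ML) values copied from the bayesstats ic table.
scalar logml_heckman = -5260.202
scalar logml_nosel   = -5283.025

* Log Bayes factor of nosel relative to heckman (the base model).
display "log(BF) = " logml_nosel - logml_heckman

* The Bayes factor itself, on the original scale.
display "BF      = " exp(logml_nosel - logml_heckman)
```

The difference reproduces the tabulated log(BF) up to rounding of the inputs, and the Bayes factor on the original scale is on the order of 1e-10. With both sets of results stored, **bayestest model heckman nosel** could also be used to report posterior model probabilities, under the prior model probabilities that command assumes by default.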

Read more about the **bayes** prefix and Bayesian analysis in the *Stata Bayesian Analysis Reference Manual*.