


st: Multinomial logit selection correction using -selmlog-. Interpretation of the estimates?

From   Ewa Cukrowska
Subject   st: Multinomial logit selection correction using -selmlog-. Interpretation of the estimates?
Date   Tue, 17 Dec 2013 18:46:39 +0100

Dear Statalist users,

Recently I have come across some interpretation problems and I would
like to ask you for your help.

I estimated an updated version of the selection correction model of
Dubin and McFadden (1984). The improvement on the original model was
recently proposed by Bourguignon, Fournier and Gurgand (2007). The
approach is to estimate a multinomial logit selection model, derive
correction terms from it, and then include those terms in
the outcome equation. For the estimation I used the -selmlog- command
provided by Gurgand and Fournier. I ran the model using the DMF(2)
specification of -selmlog-, which is Bourguignon et al.'s
modification of Dubin and McFadden's model.

I wanted to estimate wage equations for public sector, private sector
and self-employed workers, correcting for selection into each group.
In my model each individual can choose among: 1) working in the public
sector, 2) working in the private sector, 3) working as self-employed
and 4) not working at all. In the end, I obtained estimates for 3 wage
equations (public, private and self-employed) that, besides
standard demographic characteristics, include selection correction
terms. The number of selection terms is equal to the number of
multinomial logit alternatives (in my case 4). Below I report the
estimated coefficients on the correction terms for
the public sector wage equation.

m_1 (private)            0.160***
m_2 (public)            -0.080***
m_3 (self-employed)      0.166***
m_4 (not working)        0.316***

I would like to interpret the coefficients that relate to selection
terms but I am confused.

Firstly, according to Bourguignon, Fournier and Gurgand (2007), the
coefficients on the correction terms represent sigma*rho_1 to
sigma*rho_s, where s is the number of alternatives in the
multinomial logit selection model and rho_j is the correlation
coefficient between the error term in the wage equation and the error
term of alternative j in the selection equation (page 179). So a
positive estimated coefficient on a correction term indicates
that the two error terms are positively correlated. Referring to my
estimates, the positive coefficients on the private, self-employed and
not-working correction terms in the public sector wage equation imply
that unobserved skills that give an individual working in
the public sector a higher probability of choosing one of these
alternatives are positively correlated with unobserved skills that
raise the wage in the public sector. Does this mean that
individuals who are more likely to choose these alternatives have
so-called “better” unobserved skills? That would make sense, since the
correction variables (I mean the derived variables m_1, m_3 and m_4)
are negative, so the higher the probability of choosing these
alternatives, the lower the average wage of public sector workers.
The above interpretation seems appealing to me, but I got very confused by
the interpretation that I recently found in Dimova and Gang
(2007). They estimate the same model as I do, using the same estimation
method. Their interpretation, however, is the following: "For
instance, a positive bias correction coefficient related to the
private sector selection equation in the public sector wage equation
highlights higher wages of individuals in the public sector compared
to individuals taken at random, due to the allocation of people with
worse unobserved skills out of the public sector into the private
sector." (page 616). This seems to be quite the opposite…
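
To make the sign argument in the first interpretation concrete, here is a minimal numerical sketch. It does not reproduce the DMF(2) terms that -selmlog- generates. It assumes the simpler, unrestricted Dubin-McFadden form of the correction terms as described in Bourguignon, Fournier and Gurgand (2007): -ln(P_s) for the chosen alternative s, and P_j*ln(P_j)/(1-P_j) for every other alternative j. The function names and the probabilities are hypothetical, and the coefficient 0.160 is taken from the table above only for its sign, so the magnitudes printed are not meaningful.

```python
import math


def own_term(p):
    """Assumed correction term for the chosen alternative s: -ln(P_s).

    Positive for any probability strictly between 0 and 1.
    """
    return -math.log(p)


def other_term(p):
    """Assumed correction term for a non-chosen alternative j:
    P_j * ln(P_j) / (1 - P_j).

    Negative for any probability strictly between 0 and 1, and it becomes
    more negative as P_j rises.
    """
    return p * math.log(p) / (1.0 - p)


# Coefficient on m_1 (private) in the public sector wage equation, from the
# post. Used here for its sign only; it was estimated with DMF(2) terms.
COEF_PRIVATE = 0.160


def private_contribution(p_private):
    """Contribution of the private-sector term to the conditional mean wage
    of a public sector worker, under the assumptions stated above."""
    return COEF_PRIVATE * other_term(p_private)


if __name__ == "__main__":
    # Hypothetical predicted probabilities of choosing the private sector.
    for p in (0.1, 0.3, 0.5, 0.7):
        print(
            f"P_private={p:.1f}  "
            f"term={other_term(p):+.3f}  "
            f"contribution={private_contribution(p):+.3f}"
        )
```

Under these assumptions the term for a non-chosen alternative is always negative and decreases in its probability, so a positive coefficient lowers the conditional mean wage of public sector workers by more as the probability of choosing that alternative rises. Whether the same monotonicity holds for the DMF(2) terms should be checked against the variables that -selmlog- actually generates.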

My question is:

Is the interpretation of Dimova and Gang (2007) the right one (I
assume yes…), and if so, where does it come from?

Below I post the references to the papers I mention:

Bourguignon, F., Fournier M. and Gurgand M. (2007) Selection Bias
Corrections Based On The Multinomial Logit Model: Monte Carlo
Comparisons. Journal of Economic Surveys 21(1): 174-205. (available
– need to login)

Dubin, J. A. and McFadden, D. L. (1984) An econometric analysis of
residential electric appliance holdings and consumption. Econometrica
52: 345-362. (available at:

Dimova, R. and Gang, I. N. (2007) Self-selection and wages during
volatile transition. Journal of Comparative Economics 35(3): 612-629.
(full text for ScienceDirect subscribers:;
working paper is available at:

The -selmlog- command may be found here:

I would appreciate your help.

Kind regards,

