Statalist The Stata Listserver

st: Re: RE: Re: selmlog: question

From   "R.E. De Hoyos" <[email protected]>
To   <[email protected]>
Subject   st: Re: RE: Re: selmlog: question
Date   Fri, 21 Apr 2006 19:20:25 +0100


The mlogit model you are running is the same in both cases, and the predicted probabilities are also the same (the base category does not matter). What differs is the way those probabilities are parameterised for inclusion in the second-stage equation.

In the second-stage equation for "w1", the predicted probabilities are parameterised conditional on outcome 1 having been chosen (cprob1). The resulting selection components differ from those obtained when a different outcome is selected (cprob3 for outcome 3). That is why you have to estimate the model twice.
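
To make the point concrete, here is a sketch of one such parameterisation, the Dubin-McFadden form. Whether this is the variant a given -selmlog- call uses depends on the options chosen, so treat it as an illustration under that assumption, possibly differing in sign or scaling conventions. The symbols P_j, s, r_j and sigma are notation introduced here, not names from -selmlog-: P_j is the mlogit probability of outcome j, and s is the outcome whose wage equation is being estimated.

```latex
% Assumed Dubin-McFadden form; s = selected outcome, P_j = Pr(outcome = j | z)
E(u_s \mid \text{outcome} = s)
  = \sigma \frac{\sqrt{6}}{\pi} \sum_{j \neq s} r_j
    \left[ \frac{P_j \ln P_j}{1 - P_j} + \ln P_s \right]

% The regressor attached to category j therefore depends on s:
m_j^{(s)} = \frac{P_j \ln P_j}{1 - P_j} + \ln P_s ,
\qquad
m_2^{(1)} - m_2^{(3)} = \ln P_1 - \ln P_3 \neq 0 \quad \text{in general}
```

The P_j are identical in both runs, but the term built from them changes with s, so the correction terms from the w1 run and the w3 run need not agree.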


----- Original Message ----- From: "Rasmus Jørgensen" <[email protected]>
To: <[email protected]>
Sent: Thursday, April 20, 2006 1:55 PM
Subject: st: RE: Re: selmlog: question


Thanks for your reply.

There is, however, one thing that is puzzling me:

Why do you have to run selmlog twice in order to get cprob1 and cprob3? As I
see it, you're just running the same selection model once more and -- ex
ante -- I would expect that it makes no difference in the estimation of
correction terms.

However, it does!

_m3 from the first model with dep. var. w1 is much different from _m3 in the
second model. Do you have any ideas why this is the case?

I imagine that the difference may be caused by the fact that mlogit uses
category 1 as base category, regardless of whether we model w1 or w3. Do you

Once again, any advice is much appreciated.



-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of R.E. De Hoyos
Sent: 12 April 2006 01:47
To: [email protected]
Subject: st: Re: selmlog: question


A way to do this is by generating the conditional probabilities (_m
in -selmlog-) for your two outcomes of interest. You can then use them in a
single wage equation. Say "w1" are the observed wages under outcome 1
(missing values otherwise) and "w3" are the observed wages under outcome 3
(missing values otherwise) as you specified the problem. Then:

selmlog w1 x1 x2, sel(outcome x1 x2 z1) gen(cprob1)
selmlog w3 x1 x2, sel(outcome x1 x2 z1) gen(cprob3)

The above model will allow for full wage parameter heterogeneity across
outcomes 1 and 3. Depending on your particular problem this might be the
best way to account for selection (allows for separate market equilibriums
and different payments for the unobserved characteristics determining
selection [cprob]). However, if you want to impose the constraint of
homogeneity in parameters across the wage equations for outcomes 1 and 3
while still treating them as different outcomes in your selection equation:

gen cprob_13=.
replace cprob_13 = cprob1 if outcome==1
replace cprob_13 = cprob3 if outcome==3

gen w_13=.
replace w_13 = w1 if outcome==1
replace w_13 = w3 if outcome==3

reg w_13 x1 x2 cprob_13

This last model will estimate the wage equation for outcomes 1 and 3
accounting for the unobserved characteristics that made the individuals
"choose" those particular outcomes (although the market payment for those
unobservables will be the same for both groups).
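
As a sketch of the difference between the two specifications, write c_1 and c_3 for the correction terms cprob1 and cprob3. The coefficient names beta and gamma and the error terms e are notation assumed here for illustration only.

```latex
% Separate wage equations: full parameter heterogeneity across outcomes
w_{1} = x'\beta_{1} + \gamma_{1} c_{1} + e_{1} \quad (\text{outcome} = 1), \qquad
w_{3} = x'\beta_{3} + \gamma_{3} c_{3} + e_{3} \quad (\text{outcome} = 3)

% Pooled wage equation: imposes \beta_1 = \beta_3 = \beta and \gamma_1 = \gamma_3 = \gamma
w_{13} = x'\beta + \gamma\, c_{13} + e, \qquad
c_{13} =
\begin{cases}
  c_{1} & \text{if outcome} = 1 \\
  c_{3} & \text{if outcome} = 3
\end{cases}
```

The single gamma in the pooled equation is what makes the market payment for the unobservables the same for both groups.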

Notice that you will have to bootstrap the standard errors to account for
the heteroskedasticity present in the two-step procedure.
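
One possible way to do this for the pooled model is to wrap both steps in a small program and hand it to -bootstrap-, so that the selection step is re-estimated in every replication. This is a sketch only: the program name twostep13, the number of replications and the seed are arbitrary choices made here, and it assumes that gen() leaves variables named cprob1 and cprob3, as in the commands above.

```stata
* Sketch only; assumes -selmlog- is installed and behaves as in the
* commands above (gen() creating cprob1 / cprob3).
capture program drop twostep13        // hypothetical program name
program define twostep13
    * remove variables left over from a previous run or replication
    capture drop cprob1
    capture drop cprob3
    capture drop cprob_13
    capture drop w_13

    * step 1: selection model and correction terms for each outcome
    quietly selmlog w1 x1 x2, sel(outcome x1 x2 z1) gen(cprob1)
    quietly selmlog w3 x1 x2, sel(outcome x1 x2 z1) gen(cprob3)

    * step 2: pooled correction term and pooled wage variable
    generate cprob_13 = cprob1 if outcome == 1
    replace  cprob_13 = cprob3 if outcome == 3
    generate w_13 = w1 if outcome == 1
    replace  w_13 = w3 if outcome == 3

    * pooled wage equation; leaves its coefficients in e(b)
    regress w_13 x1 x2 cprob_13
end

* resample observations and rerun both steps each time
bootstrap _b, reps(200) seed(12345): twostep13
```

Resampling the whole procedure, rather than only the final regression, lets the reported standard errors reflect the fact that cprob_13 is itself estimated.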

I hope this helps,
Rafael E. De Hoyos
Faculty of Economics
University of Cambridge

----- Original Message ----- From: "Rasmus Joergensen" <[email protected]>
To: <[email protected]>
Sent: Tuesday, April 11, 2006 8:26 PM
Subject: st: selmlog: question

Dear Statalist,

I'm trying to estimate the effect of self-employment experience. My
analysis considers the following selection rules:

1. Wage-employed in period t and period t+5

2. Self-employment spell between t and t+5.

This selection model thus considers 4 possible outcomes, as illustrated below:

                 WE,t and WE,t+5
           YES      1      2
SE spell
           NO       3      4

One way to estimate this selection model is to use --selmlog--.

However, selmlog can only estimate the wage equation (the equation of
interest) for one outcome of the selection process. But I'm interested in
running a wage regression for outcomes 1 and 3 (see above). In other words,
I'm trying to estimate a model that accounts for both sample selection and
endogenous treatment (the SE spell).

Does anyone have any advice on how to modify --selmlog-- to estimate the
equation of interest for two outcomes of the selection process? Any
suggestions are very welcome.


Rasmus Jørgensen
Research Assistant
Centre for Economic and Business Research
E:< [email protected]

*   For searches and help try:

