Statalist



st: Coefficients of mlogit and predicted probabilities as generated by prtab and prgen


From   "Luis Ortiz" <luis.ortiz@upf.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Coefficients of mlogit and predicted probabilities as generated by prtab and prgen
Date   Fri, 17 Jul 2009 13:35:42 +0200

Hi,

I am puzzled by what look to me like diverging results (opposite signs):
the interaction terms in a multinomial logit model point one way, while the
predicted probabilities, as tabulated by prtab and plotted with prgen and
graph, point the other.

I am doing research on the returns to human capital investment in terms of
occupational attainment. For theoretical reasons, my dependent variable
(occup_att_2, see below) is built as follows:

1. Managers
2. Professionals
3. Associate Professionals
4. Clerks
5. Lower service and other occupations

'Clerks' is my reference category in the dependent variable.

I have fitted a multinomial logit model to the sample for one of my national
case studies. The data set pools cross-sectional labour force surveys from
up to eight different years.

Since I am especially interested in the TREND in the returns to human
capital investment, I have interacted the variable 'year' (capturing the
different years included in the data) with educational attainment.

Here are the results of one of my models. I have omitted the coefficients
of the other independent variables, which I am less interested in.
 
. xi: mlogit occup_att_2 i.tert_ed*year_3 sex_2 mstatus_2 age national_2_2 national_2_3 tenure perm_2_2 perm_2_3, b(4) nolog
i.tert_ed         _Itert_ed_1-5       (naturally coded; _Itert_ed_3 omitted)
i.tert~d*year_3   _IterXyear__#       (coded as above)

Multinomial logistic regression                   Number of obs   =     525579
                                                  LR chi2(68)     =  432135.20
                                                  Prob > chi2     =     0.0000
Log likelihood = -414328.03                       Pseudo R2       =     0.3427

------------------------------------------------------------------------------
 occup_att_2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Managers     |
 _Itert_ed_1 |   .8369161   .1164738     7.19   0.000     .6086317    1.065201
 _Itert_ed_2 |  -.5057224   .1448549    -3.49   0.000    -.7896329    -.221812
 _Itert_ed_4 |  -.1405644   .1441132    -0.98   0.329    -.4230211    .1418922
 _Itert_ed_5 |   .4043363   .1040787     3.88   0.000     .2003458    .6083269
      year_3 |   .0006106    .009962     0.06   0.951    -.0189145    .0201357
_IterXyear~1 |    .024218   .0122985     1.97   0.049     .0001133    .0483227
_IterXyear~2 |    .033616   .0151505     2.22   0.026     .0039216    .0633103
_IterXyear~4 |  -.0013059   .0143873    -0.09   0.928    -.0295046    .0268927
_IterXyear~5 |    .000611   .0112359     0.05   0.957    -.0214109    .0226329
       _cons |  -3.208653   .0965292   -33.24   0.000    -3.397847   -3.019459
-------------+----------------------------------------------------------------
Profession~s |
 _Itert_ed_1 |   3.870921   .1599636    24.20   0.000     3.557398    4.184444
 _Itert_ed_2 |  -.5270488   .1966783    -2.68   0.007    -.9125312   -.1415665
 _Itert_ed_4 |  -.3029058   .2470132    -1.23   0.220    -.7870428    .1812312
 _Itert_ed_5 |  -2.130236     .24443    -8.72   0.000     -2.60931   -1.651162
      year_3 |  -.0705025   .0171069    -4.12   0.000    -.1040313   -.0369736
_IterXyear~1 |    .058865   .0177997     3.31   0.001     .0239782    .0937517
_IterXyear~2 |   .1749493    .021033     8.32   0.000     .1337253    .2161732
_IterXyear~4 |   .0403727   .0249201     1.62   0.105    -.0084698    .0892152
_IterXyear~5 |   .0838841   .0262562     3.19   0.001     .0324228    .1353453
       _cons |  -3.485732   .1563276   -22.30   0.000    -3.792128   -3.179335
-------------+----------------------------------------------------------------
Associate ~s |
 _Itert_ed_1 |   .5250349    .088949     5.90   0.000     .3506981    .6993717
 _Itert_ed_2 |   .2204853   .0970563     2.27   0.023     .0302584    .4107123
 _Itert_ed_4 |    .219423   .1074958     2.04   0.041     .0087351    .4301109
 _Itert_ed_5 |  -.2276642   .0876818    -2.60   0.009    -.3995174   -.0558111
      year_3 |   .0345487   .0073366     4.71   0.000     .0201691    .0489282
_IterXyear~1 |   .0072084   .0093549     0.77   0.441    -.0111268    .0255436
_IterXyear~2 |   .0260399   .0102047     2.55   0.011     .0060391    .0460406
_IterXyear~4 |  -.0186982   .0106685    -1.75   0.080    -.0396081    .0022117
_IterXyear~5 |  -.0133634   .0093698    -1.43   0.154    -.0317279    .0050011
       _cons |  -.8378662   .0728109   -11.51   0.000     -.980573   -.6951594
-------------+----------------------------------------------------------------
Low servic~r |
 _Itert_ed_1 |  -.6625195   .0883424    -7.50   0.000    -.8356674   -.4893716
 _Itert_ed_2 |   .7419491   .0849377     8.74   0.000     .5754743    .9084238
 _Itert_ed_4 |   2.149201   .0871306    24.67   0.000     1.978429    2.319974
 _Itert_ed_5 |   2.418502   .0701088    34.50   0.000     2.281091    2.555912
      year_3 |   .0643842   .0063551    10.13   0.000     .0519284      .07684
_IterXyear~1 |  -.0171698   .0091566    -1.88   0.061    -.0351165    .0007769
_IterXyear~2 |  -.0339005   .0089674    -3.78   0.000    -.0514763   -.0163247
_IterXyear~4 |  -.1356851   .0087742   -15.46   0.000    -.1528822   -.1184881
_IterXyear~5 |  -.0473144   .0075397    -6.28   0.000    -.0620919   -.0325369
       _cons |   .2592752   .0623827     4.16   0.000     .1370073    .3815432
------------------------------------------------------------------------------
(occup_att_2==Clerks is the base outcome)

As you can see, the coefficient on the interaction between time (year_3) and
the dummy variable for the highest educational attainment (university
degree, _Itert_ed_1) is positive for the category 'Professionals' of the
dependent variable. So a university degree not only seems to increase the
likelihood of being in this category relative to the reference category
('Clerks'); time also seems to increase this likelihood further, again
relative to the likelihood of being in the reference category.
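
As a quick check on this reading, the arithmetic below adds the year_3 main effect to the interaction term, using only the two 'Professionals' coefficients from the output above. It is a minimal sketch: the scalar names b_year and b_inter are made up for illustration, and the sum is the yearly change in the log-odds of 'Professionals' versus 'Clerks' for university graduates, not a change in probability.

```stata
* Year slope on the log-odds of Professionals vs. Clerks.
* Both coefficients are copied from the Professionals panel of the output.
* b_year  : main effect of year_3
* b_inter : interaction of _Itert_ed_1 (university degree) with year_3
scalar b_year  = -.0705025
scalar b_inter =  .058865

* Omitted education group (_Itert_ed_3): the slope is the main effect alone
display "slope, omitted education group: " b_year

* University degree (_Itert_ed_1 = 1): main effect plus interaction
display "slope, university degree:       " b_year + b_inter
```

The second sum is -.0705025 + .058865 = -.0116375. In this model, then, the positive interaction makes the yearly change in the log-odds less negative for university graduates than for the omitted group, but the total slope for graduates is still slightly below zero.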

To present this trend graphically, a) I have fitted another multinomial
logit model that excludes the interactions between time and the educational
attainment dummies. Please note that I have excluded ONLY these interactions
from the previous model; apart from that, both models are identical.

b) I have used the prgen command to generate predicted probabilities over
the values of 'year_3' when the dummy variable for a university degree
(_Itert_ed_1) is 1, the dummies for the other educational attainment levels
are 0 and (by default) the remaining independent variables are held at
their means:

prgen year_3, x(_Itert_ed_1=1 _Itert_ed_2=0 _Itert_ed_4=0 _Itert_ed_5=0) f(6) t(13) gen(univ)

and c) I have generated the graph with:

graph twoway (scatter univp1 univp2 univp3 univp5 univp4 univx, connect(l l l l l) xtitle(University) ytitle(probability))

Now, the graph (not shown here) reveals a DECLINING predicted probability
of being a 'Professional' for those with a university degree.

This matches the decreasing predicted probabilities that appear when I run
the prtab command as follows:

prtab _Itert_ed_1 year_3, x(_Itert_ed_2=0 _Itert_ed_4=0 _Itert_ed_5=0)

(I show only the predicted probabilities for the category 'Professionals'
of the dependent variable.)

mlogit: Predicted probabilities for occup_att_2

Predicted probability of outcome 2 (Professionals)

--------------------------------------------------------------------------
tert_ed== |                             year_3                            
1         |      6       7       8       9      10      11      12      13
----------+---------------------------------------------------------------
        0 | 0.0248  0.0240  0.0232  0.0225  0.0217  0.0210  0.0203  0.0197
        1 | 0.6741  0.6662  0.6580  0.6498  0.6414  0.6329  0.6242  0.6155
--------------------------------------------------------------------------


Now to my question. I do not understand how such decreasing probabilities
can appear when the interaction of year_3 and _Itert_ed_1 was positive in
the initial model. How should I interpret this discrepancy? How is it
possible?
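
One way to explore how the two results can coexist is to compute the multinomial logit probabilities by hand from the coefficients of the interaction model. The sketch below does this for a university graduate (_Itert_ed_1 = 1) over year_3 = 6 to 13. It rests on a loud assumption: all other covariates are set to 0 rather than to their means, because their coefficients are not shown in this post. The levels will therefore not match the prtab output, which, following steps a) to c), comes from the model without interactions. The local macro names are made up for illustration.

```stata
* Sketch: P(Professionals | university degree) by year, interaction model.
* ASSUMPTION: every covariate other than education and year_3 is set to 0.
forvalues y = 6/13 {
    * Linear predictors relative to Clerks (base outcome, fixed at 0):
    * _cons + _Itert_ed_1 + (year_3 + _IterXyear~1) * year
    local man  = -3.208653 +  .8369161 + ( .0006106 + .024218 )*`y'
    local prof = -3.485732 + 3.870921  + (-.0705025 + .058865 )*`y'
    local asso = -.8378662 +  .5250349 + ( .0345487 + .0072084)*`y'
    local low  =  .2592752 -  .6625195 + ( .0643842 - .0171698)*`y'

    * The denominator sums over all five outcomes; exp(0) = 1 is Clerks
    local den  = 1 + exp(`man') + exp(`prof') + exp(`asso') + exp(`low')

    display "year_3 = " `y' "   P(Professionals) = " %6.4f exp(`prof')/`den'
}
```

Under these assumptions the probability falls from about 0.32 at year_3 = 6 to about 0.26 at year_3 = 13. The predicted probability of one outcome depends on the linear predictors of all outcomes. Here the log-odds of 'Professionals' versus 'Clerks' drift slightly downward for graduates, while those of the other three categories rise. The sign of a single interaction coefficient therefore does not by itself fix the direction of the probability trend.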

As suggested in the Statalist guidelines, I have searched the Statalist
archives for help, but I'm afraid I'm still stuck with this problem.

I would very much appreciate your help on this.

In any case, my apologies if this query is too long, and my thanks for your
attention if you have read this far.


-.-.-.-.-.-.-.-
Luis Ortiz
Profesor Agregado
Departament de Ciencies Polítiques i Socials
Universitat Pompeu Fabra
Ramon Trias Fargas, 25-27
08005 Barcelona
 
Phone: +34-93-5422368
Fax: +34-93-5422372
http://www.upf.edu/dcpis/
http://sociodemo.upf.edu/
 
 


