Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Chiara Mussida <cmussida@gmail.com> |
To | statalist <statalist@hsphsun2.harvard.edu> |
Subject | st: mlogit coefs |
Date | Tue, 17 Apr 2012 16:00:51 +0200 |
Dear All,

I ran an mlogit model for 9 labour market outcomes (transitions between the three states of employment, unemployment and inactivity, i.e. 6 transitions and 3 permanences):

    mlogit transition male_unmarried female_married female_unmarried age agesq ncomp child northw northe centre Ubenef edu1 edu2 health qu1nolav qu3nolav qu2nolav nopersincnolav noothersineq qu1ot qu2ot qu3ot if age>=15 & age<=64, b(3)

The baseline category is permanence in unemployment.

If I instead run the mlogit only on the subsample of the unemployed, thereby reducing the number of outcomes to 3:

    mlogit unemployed male_unmarried female_married female_unmarried age agesq ncomp child northw northe centre Ubenef edu1 edu2 health qu1nolav qu3nolav qu2nolav nopersincnolav noothersineq qu1ot qu2ot qu3ot if age>=15 & age<=64, b(3)

and keep permanence in unemployment as the baseline, I get different coefficient signs and significance levels for a fair number of covariates, e.g. female_unmarried.

My question is: I know that the first mlogit refers to a larger sample that includes the whole labour force, whilst the second refers only to the subsample of the unemployed, but is this enough to explain the different behaviour of the coefficients?

I chose the "extended" mlogit (9 outcomes) because of sample selection issues. Since I have data on the whole labour force, it seems better to use all of it and to avoid an ex ante selection (and a likely selection bias).

Thanks,
Chiara
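
As a point of reference for the comparison above, here is a sketch of what the multinomial logit form itself implies. It rests on an assumption not stated in the post: that the 9-outcome model is correctly specified. The labels $UU$, $UE$ and $UI$ (permanence in unemployment, and the transitions from unemployment to employment and to inactivity) are notation introduced here for illustration, with the baseline normalization $\beta_{UU} = 0$.

```latex
% 9-outcome multinomial logit with permanence in unemployment (UU) as baseline
\[
\Pr(y = j \mid x) = \frac{\exp(x'\beta_j)}{\sum_{k=1}^{9} \exp(x'\beta_k)},
\qquad \beta_{UU} = 0 .
\]
% Conditioning on starting in unemployment, i.e. on y being one of UU, UE, UI:
% the 9-term denominator cancels, leaving a 3-outcome logit with the same betas.
\[
\Pr\bigl(y = j \mid y \in \{UU, UE, UI\},\, x\bigr)
= \frac{\exp(x'\beta_j)}{\exp(x'\beta_{UU}) + \exp(x'\beta_{UE}) + \exp(x'\beta_{UI})}
= \frac{\exp(x'\beta_j)}{1 + \exp(x'\beta_{UE}) + \exp(x'\beta_{UI})} .
\]
```

Under that assumption, the 3-outcome fit on the unemployed subsample targets the same $\beta_{UE}$ and $\beta_{UI}$ as the 9-outcome fit. Sign changes between the two fits would then reflect sampling variability or a departure from the independence of irrelevant alternatives, rather than the sample restriction by itself.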