Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: "T.Randazzo" <tr81@kent.ac.uk>

To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>

Subject: RE: st: Multinomial logit model with selection

Date: Mon, 5 Mar 2012 13:43:11 +0000

Dear Stas,

Yes, that is correct! There is no selmlog in Stata, but the package (selmlog13.ado) is downloadable at http://www.parisschoolofeconomics.com/gurgand-marc/selmlog/selmlog13.html and it works!

In their Monte Carlo experiment, Bourguignon et al. (2007) find that the restriction on the correlation coefficients imposed in the original Dubin and McFadden (Econometrica, 1984) model can be relaxed to obtain more robust estimators.

1) I would like to compare both methods, DMF(0) and DMF(1): how can I test whether the assumption on the correlation coefficients (that they sum to zero) is correctly specified? (If the assumption is correct I should prefer the original model; otherwise I should choose the more flexible one.)

Also, I would like to implement the model myself:

1. Run a multinomial logit model (type_HH is my dependent variable).
2. Calculate the inverse Mills ratios.
3. Run an OLS regression where the dependent variable is expenditure on good i, and include the Mills ratios.

Because I have 4 outcomes, after running the mlogit I have to create the predicted probabilities for each outcome:

    predict p1, outcome(1)
    predict p2, outcome(2)
    predict p3, outcome(3)
    predict p4, outcome(4)

Following the advice given by Mushfiq Mobarak (http://www.stata.com/statalist/archive/2003-04/msg00465.html), the way to calculate the Mills ratios and apply Dubin and McFadden (1984) is the following:

    gen trnsp1=(p1*ln(p1))/(1-p1)
    gen trnsp2=(p2*ln(p2))/(1-p2)
    gen trnsp3=(p3*ln(p3))/(1-p3)
    gen trnsp4=(p4*ln(p4))/(1-p4)

    gen mills2= 3* ln(p2)+ trnsp1 + trnsp3 + trnsp4
    gen mills3= 3* ln(p3)+ trnsp1 + trnsp2 + trnsp4
    gen mills4= 3* ln(p4)+ trnsp1 + trnsp2 + trnsp3

2) What happens when I decide to apply the more flexible version of the Dubin and McFadden model? I should calculate 4 Mills ratios. Is the following correct?

    gen mills1= 4* ln(p1)+ trnsp2 + trnsp3 + trnsp4
    gen mills2= 4* ln(p2)+ trnsp1 + trnsp3 + trnsp4
    gen mills3= 4* ln(p3)+ trnsp1 + trnsp2 + trnsp4
    gen mills4= 4* ln(p4)+ trnsp1 + trnsp2 + trnsp3

3) Again, is there a way to test which model is more appropriate?

________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Stas Kolenikov [skolenik@gmail.com]
Sent: 02 March 2012 18:08
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Multinomial logit model with selection

Teresa,

cleanup issues in your post:

1. There is no -selmlog- in the Stata world as we know it. -findit selmlog- returns a reference to -svyselmlog- on SSC. If a package is not downloadable, it is nearly as good as non-existent. Without knowing what -selmlog- produces, it is impossible to say how to interpret its output.

2. References to the papers would be helpful, especially if coupled with links to full text or to RePEc, at least.

I can answer your question 2: I don't think any of the interpretation changes. You are doing corrections in a different way, that's all. What you called DMF(1) is more flexible, although not as internally consistent as DMF(0), but as far as I can recall Bourguignon's paper, it worked in a greater variety of settings.

On Fri, Mar 2, 2012 at 11:45 AM, T.Randazzo <tr81@kent.ac.uk> wrote:
> Dear Stata List,
> I am trying to analyze how receiving remittances can affect household expenditure behaviour in Senegal.
> I have four types of household (HH_type)
> HH_type:
> 1. HH who do not receive remittances
> 2. HH who receive remittances from national migrants
> 3. HH who receive remittances from international migrants
> 4. HH who receive remittances both from national and international migrants
>
> I would like to investigate whether differences exist in some specific expenditures (food, durable goods, education, health...)
>
> The model that I am trying to apply is a multinomial logit model with selection as presented by Dubin and McFadden (1984) and revisited by Bourguignon, Fournier and Gurgand (2007).
>
> The original DMF model [DMF(0)] is based on two assumptions: (i) a linearity assumption between the error term in the outcome equation and the error term in the choice equation; (ii) the correlation coefficients between the two error terms sum to zero.
> The DMF model [DMF(1)] proposed by Bourguignon et al. (2007) relaxes the second assumption.
> I am using the selmlog command in Stata 10.
>
> When I consider DMF(0) I end up with 3 Mills ratios (M-1).
> When I apply DMF(1) I end up with 4 Mills ratios.
>
> 1) How can I test whether the restriction on the correlation parameters is correct?
> 2) Passing from 3 to 4 Mills ratios, how does the interpretation of the relevant coefficients change?
> Model DMF(1):
> gen health1= health
> replace health1=. if HH_type !=1
> selmlog health1 varlist, select (HH_type= varlist_m) dmf(1) bootstrap(100) gen(rh1_1)
>
> gen health2= health
> replace health2=. if HH_type !=2
> selmlog health2 varlist, select (HH_type= varlist_m) dmf(1) bootstrap(100) gen(rh1_1)
>
> Considering expenditure on health, I have found that for HH_type=1 rh1_1, rh1_2 and rh1_4 are insignificant while rh1_3 is significant. For HH_type=2 only rh2_2 is significant.
> 3) How should I interpret those results?
>
> I tried to compare the results obtained from the selmlog command with the following procedure:
> a) run a mlogit where the dependent variable is HH_type
> b) calculate the Mills ratios
>
> predict p1, outcome(1)
> predict p2, outcome(2)
> predict p3, outcome(3)
> predict p4, outcome(4)
>
> gen trnsp1=(p1*ln(p1))/(1-p1)
> gen trnsp2=(p2*ln(p2))/(1-p2)
> gen trnsp3=(p3*ln(p3))/(1-p3)
> gen trnsp4=(p4*ln(p4))/(1-p4)
>
> gen mills1= 4* ln(p1)+ trnsp2 + trnsp3 + trnsp4
>
> gen mills2= 4* ln(p2)+ trnsp1 + trnsp3 + trnsp4
>
> gen mills3= 4* ln(p3)+ trnsp1 + trnsp2 + trnsp4
>
> gen mills4= 4* ln(p4)+ trnsp1 + trnsp2 + trnsp3
>
> c) Add the Mills ratios to the second-step equation (we are considering expenditure on health)
> reg health1 varlist mills1 mills2 mills3 mills4
> reg health2 varlist mills1 mills2 mills3 mills4
> reg health3 varlist mills1 mills2 mills3 mills4
> reg health4 varlist mills1 mills2 mills3 mills4
>
> 4) Does this procedure correspond to the one performed by the selmlog command? If it does, why don't I get the same results?
>
> Your help in understanding the model better would be much appreciated,
>
> Sincerely,
>
> Teresa Randazzo
> PhD candidate, University of Kent
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
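
The manual steps described in the thread (mlogit, predicted probabilities, the P*ln(P)/(1-P) terms, and the four correction terms) can be written more compactly with loops, which also avoids index slips when the same line is copied four times. The following is only a minimal sketch under stated assumptions: the data are simulated and purely hypothetical (the variable x and the random HH_type stand in for the real covariates and household types), and the formulas are reproduced exactly as written in the posts, not checked against Bourguignon et al. (2007) or against what selmlog computes internally.

```stata
* Hypothetical simulated data, used only so the sketch runs end to end
clear
set seed 12345
set obs 2000
gen x = rnormal()
gen HH_type = 1 + floor(4 * runiform())   // placeholder outcome coded 1..4

* First step: multinomial logit for household type
mlogit HH_type x

* Predicted probability and the P*ln(P)/(1-P) term for each outcome
forvalues j = 1/4 {
    predict double p`j', outcome(`j')
    gen double trnsp`j' = (p`j' * ln(p`j')) / (1 - p`j')
}

* Correction terms as written in the post:
* 4*ln(p_k) plus the trnsp terms of the other three outcomes
egen double trnsp_sum = rowtotal(trnsp1 trnsp2 trnsp3 trnsp4)
forvalues k = 1/4 {
    gen double mills`k' = 4 * ln(p`k') + trnsp_sum - trnsp`k'
}

* Sanity check: the loop agrees with the hand-written formula for outcome 1
assert reldif(mills1, 4*ln(p1) + trnsp2 + trnsp3 + trnsp4) < 1e-12
```

With real data, the first two blocks would be replaced by the actual mlogit specification. The second-step regressions and their standard errors (which selmlog obtains by bootstrap in the calls quoted above) are not covered by this sketch.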

**References**:**st: Multinomial logit model with selection***From:*"T.Randazzo" <tr81@kent.ac.uk>

**Re: st: Multinomial logit model with selection***From:*Stas Kolenikov <skolenik@gmail.com>
