st: Problem dealing with predicted probabilities from mixlogit


From   esther adler <[email protected]>
To   "[email protected]" <[email protected]>
Subject   st: Problem dealing with predicted probabilities from mixlogit
Date   Sat, 10 Sep 2011 22:59:50 +0100 (BST)

(Forgot to put the subject line, corrected)

Hi Cam, 

Thanks, but yes, I know the panel mixlogit provides a better fit. I am sorry, but that does not answer my question.

I am glad to see, however, that at least in your case you got the whole question; when I look at the Statalist archive, it seems to be truncated in the middle.

Ester


Hi Ester, So you are trying to maximize the log-likelihood of someone telling you something that you don't already know? :)  Let me try... in the mixed multinomial logit model without the panel specification, every choice task is treated as if it came from a different person, so the random coefficients are drawn independently for each task. Once you add the panel specification, preferences still vary across persons but are held fixed across the series of choice tasks for the same individual, which captures the correlation among a person's choices, and hence, the fit improves.

So yes, the panel specification is more appropriate. See: Hess, S., & Rose, J.M. (2009). Allowing for intra-respondent variations in coefficients estimated on repeated choice data. Transportation Research, Part B, 43(6), 708-719. 

Hope this helps, 
Cam 

> Date: Sat, 10 Sep 2011 19:11:39 +0100
> From: [email protected]
> Subject: st: Problem dealing with predicted probabilities from mixlogit
> To: [email protected]
> 
> I am using the mixlogit command by Arne Risa Hole. It includes traindata.dta and is explained in http://www.stata-journal.com/article.html?article=st0133
> 
> When I run
> 
> use traindata.dta
> global randvars "contract local wknown tod seasonal"
> mixlogit y price, rand($randvars) group(gid) nrep(50)
> 
> mixlpred p
> gen  lnp01=y*ln(p)
> egen LL=total(lnp01)
> 
> I obtain the same LL as that reported by the package, but if I want to take account of the panel nature of the data and run:
> 
> use traindata.dta, clear
> global randvars "contract local wknown tod seasonal"
> mixlogit y price, rand($randvars) group(gid) id(pid) nrep(50)
> mixlpred p
> gen  lnp01=y*ln(p)
> egen LL=total(lnp01)
> 
> I get LL= -1356.44 while the output of the regressions gives -1126.1653.
> 
> From what I have gathered, this is due to the panel nature of the data, though frankly, I have *no idea* how this explains the difference (the constant terms in the individual utilities cancel out in a choice model, so the explanation is not that the individual constant term cannot be estimated). Note that I have read quite a bit on the topic, so just referring me to the articles in the Stata Journal containing the formulas used in the model will not help me, since I do not understand how those explain the difference. A patient and detailed explanation would, however, be very useful, but at this point I have more or less given up on this!
> 
> Anyway, my issue is that I use the predicted probabilities to compute choice probabilities across alternatives in another dataset under different assumptions on the process consumers follow to select alternatives. I compare the individual LL for each choice process and then assign consumers to different types depending on the choice process that maximises their LL.
> 
> My issue is therefore that the individual LLs I compute using the predicted probabilities from the mixlogit model are not correct. To drive the point further, it is not even useful for me to use gmnlpred of the gmnl command, by Gu, Hole and Knox (forthcoming in Stata Journal, available here: http://www.shef.ac.uk/economics/people/hole/stata.html), and compute the individual log-likelihood with it, because I would not be able to compute the individual log-likelihood in the same way on the other dataset.
> 
> My question is then, should I therefore use the predicted probabilities from mixlogit with the panel specification rather than without the panel specification, just because the reported AIC of panel mixlogit is better? Or should I prefer the predicted probabilities from the non-panel version since the individual likelihood I would be calculating from this would at least be correct?
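
The gap between -1356.44 and -1126.1653 in the quoted message can be sketched in formulas. The notation here is introduced only for illustration and does not appear in the thread: n indexes decision makers (pid), t choice situations (gid), j alternatives, y_{njt} is the 0/1 choice indicator, L_{njt}(\beta) is the standard logit probability, and f(\beta \mid \hat\theta) is the estimated mixing distribution. The sketch assumes that mixlpred averages the logit probability of each alternative over the draws one choice situation at a time, which is consistent with the exact match reported when id() is not specified.

```latex
\begin{align*}
% Logit kernel for alternative j in choice situation t of person n
L_{njt}(\beta) &= \frac{\exp(x_{njt}'\beta)}{\sum_{k}\exp(x_{nkt}'\beta)} \\[4pt]
% What summing y*ln(p) over all rows computes (p assumed to be a per-situation average)
\widetilde{LL} &= \sum_{n}\sum_{t}\sum_{j} y_{njt}\,
   \ln \int L_{njt}(\beta)\, f(\beta \mid \hat\theta)\, d\beta \\[4pt]
% What the panel model (with id()) maximises: one integral per person
LL_{\text{panel}} &= \sum_{n} \ln \int \prod_{t}\prod_{j}
   L_{njt}(\beta)^{\,y_{njt}}\, f(\beta \mid \hat\theta)\, d\beta
\end{align*}
```

In the first sum the integral is taken separately for every choice situation; in the panel log-likelihood it is taken once per person, over the product of that person's choice probabilities. The integral of a product is not, in general, the product of the integrals, so the two quantities coincide only when each person faces a single choice situation or when no coefficient is random. Under that reading, summing y*ln(p) after the panel fit evaluates a cross-sectional log-likelihood at the panel estimates, and it is not expected to reproduce the log-likelihood reported in the panel output.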
