Mixed logit regression


Highlights

Relaxes independence of irrelevant alternatives (IIA) assumption
Random coefficients from six distributions—normal, correlated normal, log normal, truncated normal, uniform, and triangular
Case-specific variables
Robust and cluster–robust standard errors
Support for complex survey data

Mixed logit models go by many names. A few of them are the following:

Mixed multinomial logit models
Mixed discrete choice models
Discrete choice models with random coefficients

And in earlier versions of Stata, we referred to them as alternative-specific mixed logit models.

Mixed logit models are unique among the models for choice data because they allow random coefficients.

Random coefficients are of special interest to those fitting these models because they are a way around multinomial models' IIA assumption. IIA stands for "independence of irrelevant alternatives". If you have a choice among walking, public transportation, and a car and you choose walking, then once you have made your choice, the other alternatives should be irrelevant. If we took away one of the other alternatives, you would still choose walking, right? Maybe not. Human beings sometimes violate the IIA assumption.

Mathematically speaking, IIA makes alternatives independent after conditioning on covariates. If IIA is violated, the alternatives are correlated. Random coefficients allow for that correlation.
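The IIA property can be illustrated numerically outside Stata. The following Python sketch is not Stata code and is not how cmmixlogit computes anything; the cost values, the coefficient values, and the two-point coefficient distribution are all hypothetical. It shows that with a fixed coefficient the ratio of two choice probabilities is the same whether or not a third alternative is offered, while averaging over a random coefficient lets that ratio change.

```python
import math

def logit_probs(utilities):
    """Multinomial logit choice probabilities (softmax of utilities)."""
    m = max(utilities)
    weights = [math.exp(u - m) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def mixed_probs(x, betas):
    """Average logit probabilities over equally likely coefficient values."""
    per_beta = [logit_probs([b * xj for xj in x]) for b in betas]
    return [sum(p[j] for p in per_beta) / len(betas) for j in range(len(x))]

# Hypothetical attribute (say, cost) for walking, public transportation, car.
cost = [1.0, 2.0, 3.0]

# Fixed coefficient: the walking/transit ratio ignores whether car is offered.
fixed_full = logit_probs([-2.0 * c for c in cost])
fixed_sub = logit_probs([-2.0 * c for c in cost[:2]])
ratio_fixed_full = fixed_full[0] / fixed_full[1]
ratio_fixed_sub = fixed_sub[0] / fixed_sub[1]

# Random coefficient (two hypothetical mass points): the ratio now shifts
# when the car alternative is removed, so IIA no longer holds.
mix_full = mixed_probs(cost, [-2.0, 0.5])
mix_sub = mixed_probs(cost[:2], [-2.0, 0.5])
ratio_mix_full = mix_full[0] / mix_full[1]
ratio_mix_sub = mix_sub[0] / mix_sub[1]

print(ratio_fixed_full, ratio_fixed_sub)
print(ratio_mix_full, ratio_mix_sub)
```

The two-point distribution is used only to keep the arithmetic easy to follow; a continuous distribution such as the normal behaves the same way in this respect.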

Mixed logit models are often used in the context of random utility models and discrete choice analyses.

Stata's cmmixlogit command supports a variety of random coefficient distributions and allows for convenient inclusion of both alternative-specific and case-specific variables.

The cmxtmixlogit command fits these models for panel data.

Let's see it work

We want to analyze the choices clients make among services offered by a website design firm. The firm offers five plans:

Basic
Silver
Gold
Premium
Diamond

The design firm charges different prices and maintenance fees for each of the plans. Prices and maintenance fees vary across plans, of course, and they also vary client by client based on information clients provide.

We will model the probabilities of choosing plans as a function of

the prices and maintenance fees that clients are charged (variables price and mfee) and
client website traffic (variable traffic).

In the data, we have observations for each client and plan. Variable choice will be the dependent variable. It contains 0 or 1 depending on the plan each client chose.
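To make the data layout concrete, here is a small Python sketch (not Stata code) of what such long-format data might look like for two clients. The client IDs and their choices are made up, and the check assumes each client chooses exactly one plan.

```python
# Hypothetical long-format rows: one row per client (id) and plan.
plans = ["Basic", "Silver", "Gold", "Premium", "Diamond"]
chosen = {1: "Gold", 2: "Basic"}  # made-up choices for two clients

rows = [
    {"id": cid, "plan": plan, "choice": int(plan == pick)}
    for cid, pick in chosen.items()
    for plan in plans
]

def choices_per_case(rows):
    """Count how many alternatives each case has marked as chosen."""
    counts = {}
    for r in rows:
        counts[r["id"]] = counts.get(r["id"], 0) + r["choice"]
    return counts

counts = choices_per_case(rows)
print(len(rows), counts)  # 2 clients x 5 plans = 10 rows
```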

We will fit the model using cmmixlogit because we want to relax the IIA assumption. We think plans might be correlated because of how consumers perceive websites. Some clients buy the Diamond plan to make clear to their customers that they can afford it.

First, we type

. cmset id plan

to specify the names of the variables recording client IDs and plan options. Our data contain 250 clients and 5 plans.

To fit our model, we will type

. cmmixlogit choice mfee, random(price) casevars(traffic)

Despite appearances, this model has three covariates: mfee, price, and traffic. mfee appears following the dependent variable choice, price appears in the random() option, and traffic appears in the casevars() option.

Regularly specified covariates have fixed coefficients.

random()-specified covariates have random coefficients.

casevars()-specified covariates have fixed coefficients, with a separate coefficient estimated for each alternative.
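The three kinds of coefficients can be sketched in Python as a simulated choice probability. This is an illustration under stated assumptions, not Stata's implementation: every number below is hypothetical, the function and argument names are invented for this example, the random coefficient is assumed normal, and plain pseudo-random draws stand in for whatever integration sequence the estimation command uses.

```python
import math
import random

PLANS = ["Basic", "Silver", "Gold", "Premium", "Diamond"]

def logit_probs(utilities):
    """Multinomial logit choice probabilities (softmax of utilities)."""
    m = max(utilities)
    w = [math.exp(u - m) for u in utilities]
    s = sum(w)
    return [x / s for x in w]

def simulated_probs(mfee, price, traffic, b_mfee, price_mean, price_sd,
                    traffic_coefs, consts, draws=2000, seed=1):
    """Average logit probabilities over normal draws of the price coefficient."""
    rng = random.Random(seed)
    totals = [0.0] * len(PLANS)
    for _ in range(draws):
        b_price = rng.gauss(price_mean, price_sd)  # random coefficient
        utils = [
            b_mfee * mfee[j]               # fixed, shared across plans
            + b_price * price[j]           # random, differs by client
            + traffic_coefs[j] * traffic   # fixed, one per plan
            + consts[j]                    # plan-specific intercept
            for j in range(len(PLANS))
        ]
        for j, p in enumerate(logit_probs(utils)):
            totals[j] += p
    return [t / draws for t in totals]

# Hypothetical client and hypothetical coefficients (Basic is the base plan,
# so its traffic coefficient and intercept are zero).
probs = simulated_probs(
    mfee=[0.5, 0.8, 1.0, 1.2, 1.5],
    price=[1.0, 1.5, 2.0, 2.5, 3.0],
    traffic=4.0,
    b_mfee=-2.0, price_mean=-1.0, price_sd=0.8,
    traffic_coefs=[0.0, -0.1, 0.3, 0.3, 0.4],
    consts=[0.0, -0.1, 0.5, -0.9, -1.4],
)
print(dict(zip(PLANS, probs)))
```

The case-specific variable traffic has the same value for every plan within a client, which is why it can only enter through plan-specific coefficients.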

Here is the result of fitting the model:

(output omitted)

Mixed logit choice model                       Number of obs      =      1,250
Case ID variable: id                           Number of cases    =        250

Alternatives variable: plan                    Alts per case: min =          5
                                                              avg =        5.0
                                                              max =          5
Integration sequence:      Hammersley
Integration points:               567             Wald chi2(6)    =      99.92
Log simulated-likelihood = -289.87694             Prob > chi2     =     0.0000



      choice   Coefficient  Std. err.      z    P>|z|     [95% conf. interval]

plan          
        mfee    -2.727259   .2747334    -9.93   0.000    -3.265727   -2.188792
       price    -1.082258    .335134    -3.23   0.001    -1.739109   -.4254076

/Normal       
    sd(price)     .8007662   .3691184                      .3244433     1.97639

Basic           (base alternative)

Silver        
     traffic    -.1384213   .1611304    -0.86   0.390    -.4542311    .1773884
       _cons    -.1292907   .8726359    -0.15   0.882    -1.839626    1.581044

Gold          
     traffic     .2809339   .1489504     1.89   0.059    -.0110035    .5728713
       _cons     .5136765   .7752536     0.66   0.508    -1.005793    2.033146

Premium       
     traffic     .2620055   .1704879     1.54   0.124    -.0721447    .5961556
       _cons    -.8803081   .9233871    -0.95   0.340    -2.690114    .9294974

Diamond       
     traffic     .4176888   .2044853     2.04   0.041      .016905    .8184726
       _cons    -1.397811   1.131834    -1.23   0.217    -3.616166    .8205426

LR test vs. fixed parameters: chibar2(01) =     2.11  Prob >= chibar2 = 0.0732

Start by looking up from the bottom. There is a section for each of the five plans. Plan Basic was treated as the base alternative.

Above the per-plan sections is the output that applies across plans, labeled plan and /Normal.

We have one random coefficient in this model. Its mean and standard deviation are the values in the coefficient column for price and sd(price). The coefficient distribution has mean -1.08 and standard deviation 0.80. Under the assumptions of the model, the coefficient for each client is drawn from this distribution. Thus, we can think of the range of coefficients as being roughly -1.08 - 2*0.80 to -1.08 + 2*0.80, which is to say, -2.68 to 0.52. The majority of clients are less likely to purchase more expensive plans, but some are more likely to purchase them.
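The arithmetic above can be checked with a short Python sketch (not Stata code). It uses the price and sd(price) estimates from the output and, assuming the fitted normal distribution is taken at face value, also computes the implied share of clients whose price coefficient is positive. The variable names are chosen for this example.

```python
import math

mean_price = -1.082258   # coefficient on price from the output
sd_price = 0.8007662     # sd(price) from the output

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Rough range: mean plus or minus two standard deviations.
low = mean_price - 2 * sd_price
high = mean_price + 2 * sd_price

# Share of the fitted normal distribution lying above zero.
share_positive = 1.0 - normal_cdf((0.0 - mean_price) / sd_price)

print(round(low, 2), round(high, 2), round(share_positive, 3))
```

Under these estimates, the implied share with a positive price coefficient is a little under 10 percent, which matches the statement that most, but not all, clients are less likely to purchase more expensive plans.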

Tell me more

Read more about mixed logit models in the Stata Choice Models Reference Manual; see [CM] cmmixlogit. See [CM] cmxtmixlogit for mixed logit models fit to panel data.

