
This page announced the new features in Stata 15. Please see our Stata 19 page for the new features in Stata 19.

Alternative-specific mixed logit regression

What's this about

There are lots of ways of saying alternative-specific mixed logit regression. Three of them are

Mixed multinomial logit models
Mixed discrete choice models
Discrete choice models with random coefficients

Stata previously fit multinomial models. What is new is the mixed random-coefficient part. Mixed means random coefficients in this context.

Random coefficients are of special interest to those fitting these models because they are a way around multinomial models' IIA assumption. IIA stands for "independence of irrelevant alternatives". If you have a choice among walking, public transportation, or a car and you choose walking, then once you have made your choice, the other alternatives should be irrelevant. If we took away one of the other alternatives, you would still choose walking, right? Maybe not. Human beings sometimes violate the IIA assumption.

Mathematically speaking, IIA makes alternatives independent after conditioning on covariates. If IIA is violated, then the alternatives would be correlated. Random coefficients allow the alternatives to be correlated.
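To make this concrete, here is a sketch in standard textbook notation. The symbols $V_{ij}$, $x_{ij}$, $\beta$, and $f$ are our own labels for illustration and do not appear elsewhere on this page. In a plain multinomial logit, the odds between any two alternatives do not involve the remaining ones. In a mixed logit, the choice probability averages the logit formula over the distribution of the coefficients, and that property no longer holds.

```latex
% Standard logit: person i, alternatives j = 1, ..., J, utility index V_{ij} = x_{ij}'\beta
P_{ij} = \frac{\exp(V_{ij})}{\sum_{k=1}^{J} \exp(V_{ik})},
\qquad
\frac{P_{ij}}{P_{im}} = \exp(V_{ij} - V_{im})
% The ratio involves only alternatives j and m: this is IIA.

% Mixed logit: the coefficients vary across people with density f(\beta)
P_{ij} = \int \frac{\exp(x_{ij}'\beta)}{\sum_{k=1}^{J} \exp(x_{ik}'\beta)} \, f(\beta) \, d\beta
% After integrating, P_{ij}/P_{im} generally depends on all J alternatives.
```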

Mixed logit models are often used in the context of random utility models and discrete choice analyses.

Stata's new asmixlogit command supports a variety of random coefficient distributions and allows for convenient inclusion of case-specific variables.

Highlights

Relaxes independence of irrelevant alternatives (IIA) assumption
Random coefficients from 6 distributions: normal, correlated normal, lognormal, truncated normal, uniform, and triangular
Case-specific variables
Robust and cluster-robust standard errors
Support for complex survey data
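As a rough sketch of how some of these features are requested, the commands below vary the model fit in the example later on this page. They assume that example's dataset is in memory, with variables choice, mfee, price, traffic, plan, and id. The distribution keywords and the vce() option are written as we understand them from [R] asmixlogit, so check the manual entry before relying on them.

```stata
* Triangular instead of normal random coefficient on price
asmixlogit choice mfee, random(price, triangle) casevars(traffic) ///
    alternatives(plan) case(id)

* Two random coefficients, normally distributed and allowed to be correlated
asmixlogit choice, random(price mfee, correlated) casevars(traffic) ///
    alternatives(plan) case(id)

* Normal random coefficient on price with robust standard errors
asmixlogit choice mfee, random(price) casevars(traffic) ///
    alternatives(plan) case(id) vce(robust)
```

These models are fit by simulation, so each variation may take noticeably longer to converge than a fixed-coefficient model.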

Let's see it work

We want to analyze the choices clients make among services offered by a website design firm. The firm offers five plans:

Basic
Silver
Gold
Premium
Diamond

The design firm charges a different price and maintenance fee for each plan. Prices and fees also vary from client to client, based on information the clients provide.

We will model the probabilities of choosing plans as a function of

the prices and maintenance fees that clients are charged (variables price and mfee) and
client website traffic (variable traffic).

In the data, we have observations for each client and plan. Variable choice will be the dependent variable. It contains 0 or 1 depending on the plan each client chose.
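A quick way to see this layout is to list one client's rows and confirm that each client chose exactly one plan. This is only a sketch: it assumes the example dataset is already in memory, and nchosen is a temporary variable name we made up for the check.

```stata
* One row per client-plan pair, so the first client occupies the first 5 rows
sort id plan
list id plan choice price mfee traffic in 1/5, sepby(id)

* choice should equal 1 for exactly one plan per client
bysort id: egen nchosen = total(choice)
assert nchosen == 1
drop nchosen
```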

We will fit the model using asmixlogit because we want to relax the IIA assumption. We think plans might be correlated because of how consumers perceive websites. Some clients buy the Diamond plan to make clear to their customers that they can afford it.

To fit our model, we will type

. asmixlogit choice mfee, random(price) casevars(traffic)
                          alternatives(plan) case(id)

Despite appearances, this model has three covariates: mfee, price, and traffic. mfee appears following the dependent variable choice, price appears in the random() option, and traffic appears in the casevars() option.

Regularly specified covariates have fixed coefficients.

random()-specified covariates have random coefficients.

casevars()-specified covariates have fixed coefficients that are estimated separately for each alternative.
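Put together, the three kinds of covariates correspond to a utility specification along the following lines. This is our reading of the model in textbook notation, not a formula taken from the manual, and the symbols $U_{ij}$, $\beta$, $\beta_i$, $\alpha_j$, $\gamma_j$, $\mu$, and $\sigma$ are labels chosen for illustration.

```latex
% Client i, plan j
U_{ij} = \beta \, \mathrm{mfee}_{ij}        % fixed coefficient
       + \beta_i \, \mathrm{price}_{ij}     % random coefficient, varies by client
       + \alpha_j + \gamma_j \, \mathrm{traffic}_i   % plan-specific intercept and slope
       + \epsilon_{ij},
\qquad
\beta_i \sim N(\mu, \sigma^2)
% For the base alternative, \alpha_j = \gamma_j = 0.
```

In the output, the price coefficient estimates $\mu$ and sd(price) estimates $\sigma$.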

Meanwhile, alternatives() and case() specify bureaucratic details. Variable id identifies the clients in the sample, and variable plan identifies the website design plan. Our data contain 250 clients and 5 plans, meaning 5×250 = 1,250 observations.
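If you want to verify that the identifiers behave this way before fitting, a minimal check might look like the following. It assumes the example dataset is in memory.

```stata
* id and plan together should uniquely identify every observation
isid id plan

* 250 clients times 5 plans
assert _N == 5*250

* Each plan should appear once per client, that is, 250 times
tabulate plan
```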

Here is the result of fitting the model:

(output omitted)

Alternative-specific mixed logit               Number of obs      =      1,250
Case variable: id                              Number of cases    =        250

Alternative variable: plan                     Alts per case: min =          5
                                                              avg =        5.0
                                                              max =          5
Integration sequence:      Hammersley
Integration points:                50             Wald chi2(6)    =     100.16
Log simulated likelihood = -289.85866             Prob > chi2     =     0.0000



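------------------------------------------------------------------------------
      choice |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
plan         |
        mfee |   -2.72992   .2746756    -9.94   0.000    -3.268274   -2.191565
       price |  -1.083672   .3355608    -3.23   0.001    -1.741359   -.4259849
-------------+----------------------------------------------------------------
Normal       |
   sd(price) |   .8134459   .3653596                      .3372959    1.961762
-------------+----------------------------------------------------------------
Basic        |  (base alternative)
-------------+----------------------------------------------------------------
Silver       |
     traffic |  -.1388025   .1612408    -0.86   0.389    -.4548286    .1772236
       _cons |  -.1323646   .8731152    -0.15   0.880    -1.843639     1.57891
-------------+----------------------------------------------------------------
Gold         |
     traffic |   .2812882   .1490178     1.89   0.059    -.0107814    .5733578
       _cons |   .5150292   .7756453     0.66   0.507    -1.005208    2.035266
-------------+----------------------------------------------------------------
Premium      |
     traffic |   .2629573   .1706901     1.54   0.123    -.0715891    .5975038
       _cons |  -.8781419   .9244092    -0.95   0.342    -2.689951    .9336669
-------------+----------------------------------------------------------------
Diamond      |
     traffic |   .4185438   .2048368     2.04   0.041      .017071    .8200166
       _cons |  -1.399031   1.133266    -1.23   0.217    -3.620191    .8221302
------------------------------------------------------------------------------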
LR test vs. fixed parameters: chibar2(01) =     2.15  Prob >= chibar2 = 0.0714

Start by looking up from the bottom. There is a section for each of the five plans. Plan Basic was treated as the base alternative.

Above the plan-specific sections is the output that applies across all plans, labeled plan and Normal.

We have one random coefficient in this model. Its mean and standard deviation are the values in the coefficient column for price and sd(price). The coefficient distribution has mean -1.08 and standard deviation 0.81. Under the assumptions of the model, the coefficient for each client is drawn from this distribution. Thus we can think of the range of coefficients as being roughly -1.08-2*0.81 to -1.08+2*0.81, which is to say, -2.7 to 0.54. The majority of clients are less likely to purchase more expensive plans, but some are more likely to do that.
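The arithmetic above can be reproduced with display, and the same normal assumption gives a rough figure for the "some" who respond positively to price. This is a back-of-the-envelope sketch: the estimates are typed in by hand from the output, and mu and sigma are scalar names we chose for illustration.

```stata
* Mean and standard deviation of the price coefficient, copied from the output
scalar mu    = -1.083672
scalar sigma = .8134459

* Rough range: mean plus or minus two standard deviations
display mu - 2*sigma
display mu + 2*sigma

* Share of clients whose price coefficient is positive under normality:
* P(b > 0) = normal(mu/sigma)
display normal(mu/sigma)
```

The last line evaluates to roughly 0.09, so under the fitted distribution about 9% of clients would be more likely to choose a plan as its price rises. The calculation ignores the sampling uncertainty in both estimates.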

Tell me more

Read more about mixed logit models in the Stata Base Reference Manual; see [R] asmixlogit.
