Notice: On April 23, 2014, Statalist moved from an email list to a forum.

Re: st: clogit for discrete choice experiment with multiple choice sets

From   Klaus Pforr
Subject   Re: st: clogit for discrete choice experiment with multiple choice sets
Date   Mon, 30 Jan 2012 10:56:44 +0100


Dear Hadji,

I forgot to mention this: since you speak of choice experiments, you probably randomized the respondents over the "choice sets". This means that you don't need fixed effects for the individuals, because you can safely assume that the covariates and the individual heterogeneity are independent. You should therefore be able to use the more efficient random effects models described in Train. I don't have much experience with how these models are estimated, but you can get quite far with GLM. A good read on this with Stata might be Rabe-Hesketh. You could also start with a pooled mlogit, with robust or bootstrapped standard errors to correct for the correlation within individuals.

And to finally give you a short answer to the original question: I don't see how you can use clogit on your data without some major modifications. It's not simply a matter of choosing the grouping variable.



On 30.01.2012 10:36, Klaus Pforr wrote:
Dear Hadji,

this seems to be an application for a multilevel or panel multinomial logit. There is a fixed effects model by Chamberlain (1980). The fixed effects are in your case at the person level. Possible random effects solutions are discussed in Train (2009). The first model has not been implemented yet (cf. Allison 2009, p. 44), but I am currently working on an ado for this model. The latter models are complicated and can be estimated with GLM.

There is also a back-door solution for the fixed effects estimator for small samples and short panels/small clusters (in your case, the number of experiments). Börsch-Supan applied the Chamberlain model to housing choices and rearranged the data in such a way that he could use the already implemented clogit to estimate the model. The data organisation is the following: in a simplified version of your case you would have only 3 experiments (or panel time points in the Chamberlain lingo) and 3 alternatives. Let's say individual 1 made this selection (the example is purposely simple):
xp choice
1 1
2 2
3 3

When you look up the equation in the Chamberlain model, you find a conditional likelihood: the probability of the sequence of choices that was actually made, conditional on (i.e. divided by) the sum of the probabilities of all permutations of the chosen alternatives.
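As a rough sketch of that conditional likelihood contribution for one individual (the notation is an assumption introduced here, not Chamberlain's own): let V_{itj} be the systematic utility of alternative j for individual i in experiment t, net of the individual fixed effects, let y_{it} be the observed choice, let n_{ij} be the number of times individual i chose alternative j, and let B_i be the set of all sequences with the same counts n_{ij} as the observed one.

```latex
P\bigl(y_{i1},\dots,y_{iT} \,\big|\, n_{i1},\dots,n_{iJ}\bigr)
  = \frac{\exp\Bigl(\sum_{t=1}^{T} V_{i t y_{it}}\Bigr)}
         {\sum_{s \in B_i} \exp\Bigl(\sum_{t=1}^{T} V_{i t s_t}\Bigr)}
```

Because every sequence in B_i contains each alternative equally often, the individual-by-alternative fixed effects appear identically in the numerator and in every term of the denominator and cancel.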

You look at all sequences of choices that contain the same number of 1's, 2's and 3's (or, in general, of each of your outcomes) as the observed sequence for the specific individual. This set of permutations forms your set of alternatives:

Permutation Was it chosen?
123 yes
132 no
213 no
231 no
312 no
321 no
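
The enumeration in this table can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the ado mentioned above; the function name choice_set_from_sequence is made up for this example.

```python
from itertools import permutations

def choice_set_from_sequence(chosen):
    """Return every distinct ordering of the observed choices,
    each paired with a flag telling whether it is the observed sequence."""
    chosen = tuple(chosen)
    # set() removes duplicate orderings that arise when an alternative
    # was chosen more than once (e.g. the sequence 1, 1, 2).
    distinct = sorted(set(permutations(chosen)))
    return [(seq, seq == chosen) for seq in distinct]

# Individual 1 chose alternative 1, then 2, then 3.
rows = choice_set_from_sequence([1, 2, 3])
for seq, was_chosen in rows:
    print("".join(map(str, seq)), "yes" if was_chosen else "no")
```

Note that the number of rows grows quickly: with 10 experiments and 3 alternatives, one respondent can generate several thousand distinct sequences.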

After this reorganisation you run a clogit on the data with the respondent as group, and you have the multinomial logit with fixed effects. This is very cumbersome even in your simple application, but it works. You also have to think about how to generate your independent variables for this to get the coefficients that you want.
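
One possible way to generate the independent variables, sketched in Python under strong assumptions: a single alternative-specific attribute with one common coefficient beta, and made-up attribute values. Under these assumptions the covariate for each permutation row is the attribute summed over the experiments along that sequence. The names x, sequence_rows and conditional_prob are hypothetical.

```python
from itertools import permutations
from math import exp

# Hypothetical data: x[t][j] is the attribute of alternative j in experiment t.
x = {1: {1: 0.0, 2: 1.0, 3: 2.0},
     2: {1: 1.0, 2: 0.0, 3: 2.0},
     3: {1: 2.0, 2: 1.0, 3: 0.0}}
chosen = (1, 2, 3)  # observed choices in experiments 1, 2, 3

def sequence_rows(chosen, x):
    """One row per distinct permutation of the observed sequence,
    with the attribute summed over experiments along that sequence."""
    rows = []
    for seq in sorted(set(permutations(chosen))):
        x_sum = sum(x[t][j] for t, j in enumerate(seq, start=1))
        rows.append({"sequence": seq,
                     "chosen": int(seq == tuple(chosen)),
                     "x_sum": x_sum})
    return rows

def conditional_prob(rows, beta):
    """Conditional-logit probability of the observed sequence
    among all permutation rows of one respondent."""
    denom = sum(exp(beta * r["x_sum"]) for r in rows)
    num = sum(exp(beta * r["x_sum"]) for r in rows if r["chosen"])
    return num / denom

rows = sequence_rows(chosen, x)
print(conditional_prob(rows, beta=0.0))  # all six sequences equally likely
```

The rows produced this way, with the respondent as the group, have the layout a conditional logit expects. Variables that are constant across a respondent's rows, such as income, cancel out of this likelihood.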

Here is the literature:

Börsch-Supan, Axel. 1987. Econometric analysis of discrete choice: With applications on the demand for housing in the U.S. and West-Germany. Berlin et al.: Springer Verlag.

Börsch-Supan, Axel. 1990. Panel data analysis of housing choices. Regional science and urban economics 20: 65–82.

Börsch-Supan, Axel, and Henry O. Pollakowski. 1990. Estimating housing consumption adjustments from panel data. Journal of urban economics 27: 131–150.

Chamberlain, Gary. 1980. Analysis of Covariance with Qualitative Data. Review of Economic Studies 47: 225–238.

Train, Kenneth E. 2009. Discrete choice methods with simulation. 2nd ed. Cambridge, MA et al.: Cambridge University Press.

I hope this helps



On 28.01.2012 08:32, Hadji Cortez Jalotjot wrote:

I implemented a discrete choice experiment to model vehicle choice. In my questionnaire, I presented each respondent with 10 choice experiments or choice sets, with each choice set having 3 alternatives or choices.

The explanatory variables are the characteristics of the vehicles. With this, I am fitting a conditional logit model.

In my data set, dummy variables were used to represent the explanatory variables. Since each choice experiment has 3 alternative options, each choice experiment corresponds to 3 rows of observations. So 10 choice experiments per respondent x 3 alternative options per choice experiment = 30 rows of observations per respondent. (The sample data below show only 3 choice experiments with some of the explanatory variables for respondent 1.)

respno  choice_set  choice  var1a  var1b  var1c  ...  none
1       1           1       1      0      0           0
1       1           0       0      0      1           0
1       1           0       0      0      0           1
1       2           0       0      1      0           0
1       2           1       1      0      0           0
1       2           0       0      0      0           1
1       3           0       0      0      1           0
1       3           0       1      0      0           0
1       3           1       0      0      0           1

For clogit to work, I must select a variable that will identify the grouping for which the software will run the analysis.

Now, for this kind of data, in which respondents answered multiple choice sets (10 in my case), which variable should I use as the group? Is it respno or choice_set?

I am confused because if I use respno, Stata says there are multiple positive outcomes in a group, and the predicted probabilities are computed over all 30 alternative options and not only over the 3 alternative options per choice set.

But if I use choice_set as the grouping variable and I extend the model to
include respondent characteristics (e.g. income), I may have a problem
with fixed effects because, for example, choice_set 1 and choice_set 2 are
from the same respondent and therefore will have exactly the same

Any advice is appreciated.



Klaus Pforr
Universität Mannheim
D - 68131 Mannheim
Tel:  +49-621-181 2797
fax:  +49-621-181 2803

Visitor address: A5, Room A309
