
Re: AW: st: maximum number of outcomes in mlogit

From   Steven Samuels <>
Subject   Re: AW: st: maximum number of outcomes in mlogit
Date   Wed, 21 May 2008 09:50:08 -0400


I am not sure that this is a task for -mlogit- or -mprobit-. You have reduced the data to n=360 observations, each a vector of means. You grouped the 360 occupations "by hand" into 54 groups and did a discriminant analysis, which confirms that the groups are separated. I agree that this is tautological.

So, I am not sure how to address your question (1) with this reduced set of data. You can approach question (2), perhaps, with a cluster analysis.
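
As a rough illustration of the cluster-analysis idea for question (2), here is a minimal sketch in Python rather than Stata. It uses average-linkage agglomerative clustering, which is only one of many possible choices, and the six "occupations", their two variables, and the group labels A/B/C are all made up. Hand-made groups that land in the same cluster would be candidates for combining.

```python
import math

def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(points, k):
    """Merge the two closest clusters (average pairwise distance) until k remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(euclid(points[a], points[b])
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical toy data: 6 occupations, 2 standardized mean variables each.
occupation_means = [
    (0.0, 0.1), (0.2, 0.0),   # hand-made group A
    (0.1, 0.3), (0.3, 0.2),   # hand-made group B (deliberately close to A)
    (5.0, 5.1), (5.2, 4.9),   # hand-made group C
]
hand_groups = ["A", "A", "B", "B", "C", "C"]

clusters = average_linkage(occupation_means, k=2)
for members in clusters:
    # Which hand-made groups ended up together in this cluster?
    print(sorted({hand_groups[i] for i in members}), members)
```

With 80 variables on different scales, standardizing them before computing distances would matter; the toy data above are assumed to be standardized already.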

To answer (1), you might have held out a sample of the original individuals from each occupation, created occupational groups as best you could, and then tested how accurately you could classify the individuals in the hold-out sample into the groups.
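
The hold-out idea can be sketched as follows, again in Python rather than Stata. Everything here is an assumption for illustration: the data are simulated, the occupation and group names are hypothetical, 20% of individuals are held out within each occupation, and a simple nearest-centroid rule stands in for the discriminant analysis.

```python
import random

random.seed(2008)

# Hypothetical setup: 4 occupations, assigned "by hand" to 2 groups.
occupation_group = {"occ1": "G1", "occ2": "G1", "occ3": "G2", "occ4": "G2"}
occupation_center = {"occ1": (0.0, 0.0), "occ2": (1.0, 0.5),
                     "occ3": (10.0, 10.0), "occ4": (11.0, 9.5)}

# Simulate 50 individuals per occupation with two variables each.
individuals = []
for occ, (cx, cy) in occupation_center.items():
    for _ in range(50):
        individuals.append((occ, (random.gauss(cx, 1.0), random.gauss(cy, 1.0))))

# Hold out 20% of the individuals within each occupation.
train, holdout = [], []
for occ in occupation_group:
    members = [ind for ind in individuals if ind[0] == occ]
    random.shuffle(members)
    cut = len(members) // 5
    holdout.extend(members[:cut])
    train.extend(members[cut:])

def centroid(rows):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(rows)
    return tuple(sum(r[k] for r in rows) / n for k in range(len(rows[0])))

# Group centroids are estimated from the training individuals only.
group_rows = {}
for occ, x in train:
    group_rows.setdefault(occupation_group[occ], []).append(x)
group_centroid = {g: centroid(rows) for g, rows in group_rows.items()}

def classify(x):
    """Assign x to the group with the nearest centroid (squared distance)."""
    return min(group_centroid,
               key=lambda g: sum((a - b) ** 2 for a, b in zip(x, group_centroid[g])))

# Accuracy is measured on individuals that played no part in the estimation.
correct = sum(classify(x) == occupation_group[occ] for occ, x in holdout)
accuracy = correct / len(holdout)
print(f"hold-out accuracy: {accuracy:.2f} on {len(holdout)} individuals")
```

The point of the design is that the hold-out individuals contribute nothing to the group centroids, so the accuracy is not tautological in the way the resubstitution result from the discriminant analysis is.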


On May 21, 2008, at 4:08 AM, Tiemann, Michael wrote:

Hello again,

You are quite right, this is a lot of alternatives. Maybe I can clarify why this is so.
The alternatives are groups of occupations, coded according to a German classification of occupations, so I definitely do not have a factorial structure. The information on the occupations has been aggregated from individual data (a microcensus file with more than 200,000 cases). What I now have is a matrix of independent variables (means over the respondents in each occupation) for around 360 occupations. Technically, there are now 360 cases (occupations) and 80 independent variables.
With these data, the groups of occupations mentioned above were formed "by hand", following the idea that the independent variables can be used to classify the occupations correctly. What I want to do now is find out whether this assumption really holds. I have already done a discriminant analysis, but that mainly tells me how well we applied our theoretically derived aggregation rules. Of course, it also tells me that the independent variables can be used to discriminate the occupations in the way we did. But this must be the case -- it is somewhat tautological.
With the multinomial logit model (mnlm) I want to do two things: 1) find out whether the independent variables can be used to discriminate the occupations and 2) find out (which -mlogtest, combine- would do nicely) whether there are groups that could be combined. Does this make anything clearer?

Meanwhile, I will have a look at -glm-, -ipf- and -mprobit-.

Thanks for your answers so far,


Michael Tiemann

Bundesinstitut für Berufsbildung (BIBB)
Arbeitsbereich 2.2 "Qualifikation, berufliche Integration und Erwerbstätigkeit"
Robert-Schuman-Platz 3
53175 Bonn

Phone: 0228 / 107-1235
-----Original Message-----
From: [mailto:owner-] On Behalf Of Richard Williams
Sent: Wednesday, 21 May 2008 06:08
Subject: RE: st: maximum number of outcomes in mlogit

At 08:15 PM 5/20/2008, jverkuilen wrote:

I want to estimate a multinomial logistic model with mlogit. But
my dependent variable holds 54 categories. Now, when I try to run
mlogit I am informed there are too many categories.

Holy cow, that's a lot of alternatives. I am with Maarten on this,
unless you have a truly huge (and amazingly well-conditioned) dataset
you are going to be getting a lot of perfect prediction.

I agree. Having said that, I wonder whether -mprobit- would work. I don't see a documented limit on the number of categories. Of course, you might have to wait a few months for it to finish running.

Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: Richard.A.Williams.5@ND.Edu
