Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

Re: st: mlogit estimation p-value problem

From   Richard Williams <>
Subject   Re: st: mlogit estimation p-value problem
Date   Tue, 11 Jun 2013 08:47:37 -0500

At 06:54 AM 6/11/2013, Andreas Chouliaras wrote:
Dear All,

I will try to address David's questions:

First of all, why am I not using an ordered logit? Well, I believe
that in an ordered logit, the odds of getting a value for the count
equal to 5, instead of 4, are equivalent to the odds of observing 3
instead of 2. With such a constraint the estimates will be less
efficient if the odds are not proportional. And I don't think I have a
strong reason why the odds should be proportional in my study.

You can always test whether the proportional odds assumption is met. If some variables meet the assumption while others do not, you could estimate a partial proportional odds model with the user-written -gologit2-. mlogit models aren't very parsimonious, so personally I'd rather see whether some sort of ordinal model is viable.
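
[Editorial illustration, not part of the original message.] A minimal sketch of how such a check might look for model 2, using the variable names given later in this thread. It assumes the user-written commands -brant- (distributed with the SPost package) and -gologit2- are installed; with as few as 4 observations in outcome 5, the Brant test may not be computable for every category.

```stata
* Sketch only: variable names are those of model 2 in this thread.
* -brant- and -gologit2- are user-written; install them first,
* e.g. "ssc install gologit2" and "search spost" for -brant-.

* Fit the ordered logit, then test the proportional odds assumption
ologit bcW vW eRW iRW bW bcP
brant, detail

* Partial proportional odds model: autofit relaxes the constraint
* only for variables where the tests reject proportionality
gologit2 bcW vW eRW iRW bW bcP, autofit
```

If the Brant test rejects proportionality for only some predictors, the -gologit2- fit keeps the constraint for the rest, which uses far fewer parameters than the corresponding -mlogit-.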

I have a total of 8 groups (bcE, bcW, bcP, bcC, tcE, tcW, tcP, tcC).
The total observations are 1877. The groups starting with b (bcE, bcW,
bcP, bcC) are examined separately from the groups starting with t
(tcE, tcW, tcP, tcC). More specifically, my primary interest is to see
the interactions of bcP with the other groups starting with b (bcE,
bcW, bcC), and the same for tcP for groups starting with "t" (tcE,
tcW, tcC). Thus, I use bcP as an independent variable in 3 different
cases: bcE as the dependent, bcW as the dependent, bcC as the
dependent. Also, I use tcP as an independent variable in 3 cases: tcE
as the dependent, tcW as the dependent, tcC as the dependent.

Thus, I am primarily interested in the results of 6 multinomial logit models:

A: For the "b" groups
1) mlogit bcE vE eRE iRE bE bcP
2) mlogit bcW vW eRW iRW bW bcP
3) mlogit bcC vC eRC iRC bC bcP

B: For the "t" groups
4) mlogit tcE vE eRE iRE bE tcP
5) mlogit tcW vW eRW iRW bW tcP
6) mlogit tcC vC eRC iRC bC tcP

For these 6 models, I believe there is
a problem for the results of model 2, outcome 5.

Now, regarding the observations of outcome 5:

bcW has 4 observations for outcome 5, and tcW has 3 observations for
outcome 5. The numbers of observations for outcome 5 in the other
groups are:

bcE : 18 tcE : 11
bcP : 17 tcP : 12
bcC : 41 tcC : 31

So for the problematic case of model 2, the dependent variable (bcW)
has 4 observations for outcome 5, while bcP has 17. Maybe the problem
is that bcW has only 4 observations as you mentioned. But on the other
hand tcW has only 3 observations for outcome 5 as well.

Furthermore, when I drop bcP from model 2, the coefficients are
significant for 4 of the other 5 variables.

How do you think I should deal with these issues?

On Mon, Jun 10, 2013 at 12:14 PM, David Hoaglin <> wrote:
> Dear Andreas,
> It is difficult to give good suggestions without seeing your Stata
> commands and output.
> I am puzzled by your analysis.  If the dependent variable is actually
> a count (which can take values of 0 through 5), that would make the
> six outcome categories ordered.  A multinomial logistic regression
> treats the outcome categories as unordered.  You could consider an
> ordinal logistic regression, but that would not use the equal spacing
> of the count.
> You did not mention the number of groups or the total number of
> observations.  Perhaps the outcome of 5 has too few observations.
> David Hoaglin
> On Mon, Jun 10, 2013 at 4:42 AM, Andreas Chouliaras <> wrote:
>> Dear all,
>> I estimate an mlogit for a discrete dependent variable that takes the
>> values 0 to 5 (count variable). I have different groups thus I have
>> different count variables for each group. At a later point I want to
>> see whether there is some relationship between the dependent variables
>> of the different groups, and I use the count variable of one group as
>> an independent variable for another group. But this causes the
>> following problem: for outcome 5, all coefficients are insignificant.
>> If I remove the count variable, most of the coefficients are
>> significant, so I guess there must be something wrong with the
>> inclusion of the count variable. My initial guess was
>> multicollinearity, but using the "collin" command I don't get any very
>> high VIFs. Any idea what might be the reason?

Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
