Re: st: mlogit estimation p-value problem
Andreas Chouliaras <firstname.lastname@example.org>
Tue, 11 Jun 2013 13:54:25 +0200
I will try to address David's questions:
First of all, why am I not using an ordered logit? Well, I believe
that in an ordered logit, the odds of getting a value for the count
equal to 5, instead of 4, are equivalent to the odds of observing 3
instead of 2. With such a constraint the estimates will be less
efficient if the odds are not proportional. And I don't think I have a
strong reason why the odds should be proportional in my study.
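To make the constraint in question concrete, the two specifications can be written side by side. This is only a sketch in generic notation (the symbols \kappa_j, \alpha_j, \beta and \beta_j are not from the thread, and base outcome 0 is assumed for the multinomial model). Strictly speaking, the ordered logit restriction applies to the cumulative odds at each cutpoint, which share a single coefficient vector, whereas the multinomial logit estimates a separate coefficient vector for every non-base outcome.

```latex
% Ordered logit (proportional odds) for a count Y in {0,...,5}:
% one coefficient vector \beta is shared by all five cutpoints.
\[
\log\frac{P(Y \le j \mid x)}{P(Y > j \mid x)} = \kappa_j - x'\beta,
\qquad j = 0,\dots,4
\]

% Multinomial logit with base outcome 0 (assumed):
% each outcome j has its own intercept and coefficient vector.
\[
\log\frac{P(Y = j \mid x)}{P(Y = 0 \mid x)} = \alpha_j + x'\beta_j,
\qquad j = 1,\dots,5
\]
```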
I have a total of 8 groups (bcE, bcW, bcP, bcC, tcE, tcW, tcP, tcC).
The total observations are 1877. The groups starting with b (bcE, bcW,
bcP, bcC) are examined separately from the groups starting with t
(tcE, tcW, tcP, tcC). More specifically, my primary interest is to see
the interactions of bcP with the other groups starting from b (bcE,
bcW, bcC), and the same for tcP for groups starting with "t" (tcE,
tcW, tcC). Thus, I use bcP as an independent variable in 3 different
cases: bcE as the dependent, bcW as the dependent, bcC as the
dependent. Also, I use tcP as an independent variable in 3 cases: tcE
as the dependent, tcW as the dependent, tcC as the dependent.
Thus, I am primarily interested in the results of 6 multinomial logit models:
A: For the "b" groups
1) mlogit bcE vE eRE iRE bE bcP
2) mlogit bcW vW eRW iRW bW bcP
3) mlogit bcC vC eRC iRC bC bcP
B: For the "t" groups
4) mlogit tcE vE eRE iRE bE tcP
5) mlogit tcW vW eRW iRW bW tcP
6) mlogit tcC vC eRC iRC bC tcP
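For reference, the six commands above can also be run in a pair of loops. This is a minimal sketch that assumes the variable names exactly as listed; the baseoutcome(0) option and the stored-estimate names (m_bcE and so on) are assumptions added for illustration, not part of the original commands. Without baseoutcome(), Stata picks the most frequent outcome as the base.

```stata
* Sketch: fit models 1-6 and store each set of estimates.
* baseoutcome(0) and the m_* names are assumptions, not from the original post.
foreach p in b t {
    foreach g in E W C {
        mlogit `p'c`g' v`g' eR`g' iR`g' b`g' `p'cP, baseoutcome(0)
        estimates store m_`p'c`g'
    }
}
```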
For these 6 models, I believe there is a problem with the results of
model 2, outcome 5.
Now, regarding the observations of outcome 5:
bcW has 4 observations for outcome 5, tcW has 3 observations for
outcome 5. The numbers of observations for outcome 5 in the other
groups are:
bcE : 18    tcE : 11
bcP : 17    tcP : 12
bcC : 41    tcC : 31
So for the problematic case of model 2, the dependent variable (bcW)
has 4 observations for outcome 5, while bcP has 17. Maybe the problem
is that bcW has only 4 observations as you mentioned. But on the other
hand tcW has only 3 observations for outcome 5 as well.
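One way to look at the sparseness directly is to tabulate the dependent variable and cross it with the count regressor. The commands below are a sketch using only the variable names given above. With 4 observations in outcome 5 and five regressors plus a constant in that equation, large standard errors for outcome 5 would not be surprising, and empty or near-empty cells in the cross-tabulation would point the same way.

```stata
* Frequencies of each outcome (0-5) for the two W dependent variables
tabulate bcW
tabulate tcW

* Cross-tabulation of the dependent variable against the count regressor;
* look for empty or near-empty cells in the row for outcome 5
tabulate bcW bcP

* Spread of the regressors among the observations with outcome 5
summarize vW eRW iRW bW bcP if bcW == 5
```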
Furthermore, when I drop bcP from model 2, the coefficients are
significant for 4 of the other 5 variables.
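Rather than comparing individual p-values with and without bcP, the two versions of model 2 can be compared formally. The following is a sketch, not a recommendation from the thread: the names full and restricted are arbitrary, and the restricted model is fitted on the same estimation sample so that the likelihood-ratio test is valid. With six outcomes, bcP contributes one coefficient per non-base outcome, so the test has 5 degrees of freedom.

```stata
* Full model 2, including bcP
mlogit bcW vW eRW iRW bW bcP
estimates store full

* Wald test that the bcP coefficient is zero in every outcome equation
test bcP

* Restricted model without bcP, on the same estimation sample
mlogit bcW vW eRW iRW bW if e(sample)
estimates store restricted

* Likelihood-ratio test of the restricted model against the full model
lrtest full restricted
```

With very few observations in one outcome, Wald statistics can behave poorly, so the likelihood-ratio result may be the more informative of the two.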
How do you think I should deal with these issues?
On Mon, Jun 10, 2013 at 12:14 PM, David Hoaglin <email@example.com> wrote:
> Dear Andreas,
> It is difficult to give good suggestions without seeing your Stata
> commands and output.
> I am puzzled by your analysis. If the dependent variable is actually
> a count (which can take values of 0 through 5), that would make the
> six outcome categories ordered. A multinomial logistic regression
> treats the outcome categories as unordered. You could consider an
> ordinal logistic regression, but that would not use the equal spacing
> of the count.
> You did not mention the number of groups or the total number of
> observations. Perhaps the outcome of 5 has too few observations.
> David Hoaglin
> On Mon, Jun 10, 2013 at 4:42 AM, Andreas Chouliaras <firstname.lastname@example.org> wrote:
>> Dear all,
>> I estimate an mlogit for a discrete dependent variable that takes the
>> values 0 to 5 (count variable). I have different groups thus I have
>> different count variables for each group. At a later point I want to
>> see whether there is some relationship between the dependent variables
>> of the different groups, and I use the count variable of one group as
>> an independent variable for another group. But this causes the
>> following problem: for outcome 5, all coefficients are insignificant.
>> If I remove the count variable, most of the coefficients are
>> significant, so I guess there must be something wrong with the
>> inclusion of the count variable. My initial guess was
>> multicollinearity, but using the "collin" command I don't get any very
>> high VIFs. Any idea what might be the reason?
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/