Statalist



st: RE: mlogit problem with "predict"


From   Sheela Athreya <athreya@neo.tamu.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: RE: mlogit problem with "predict"
Date   Tue, 07 Aug 2007 10:04:37 -0500

Maarten, many thanks for your helpful response. As soon as I posted the message I wished I had clarified what the variables pc1-pc4 were.

------Maarten Buis wrote:


Are pc1 - pc4 dummy variables for a single categorical variable (called pc)?

They are quantitative variables: principal component scores from my original data set, which consisted of a series of measurements taken on a set of human skulls that make up my cases (I'm an anthropologist, lest that sound too morbid).

I ran PCA to reduce the number of variables and to address the fact that measurements taken on the skull are not independent. The resulting component scores are uncorrelated, so I can work with orthogonal predictors.


What you are looking for are (almost) empty cells, and (almost) perfect
predictions, or anything else that looks odd. It's pretty hard to
explain what it is that makes such a table look "odd", other than if
something is really wrong it is usually pretty obvious (though not
always).

I propose an incremental approach: See if you can solve the problem
by looking at -tab tax pc-, and report back to us. If that solves the
problem, great, if not, we'll try something else. (Notice that we are
living in different time zones, so it might take some time before I
get back to you, but somebody else on the Statalist might jump in)

The idea that I might have an (almost) perfect prediction occurred to me too, but I don't know how to investigate it. Is there an equivalent of -tab tax pc- for quantitative variables that would let me check for anything that looks off?
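One possible way to do this kind of check with continuous predictors is sketched below. It is only a sketch, assuming the data set is in memory and that the variables are named tax and pc1-pc4 as in this thread. The idea is to compare the range of each score across the outcome groups: if the minimum and maximum for one group do not overlap with those of the others, that score may separate the group (almost) perfectly.

```stata
* Assumption: the data are in memory, with tax and pc1-pc4 named as in this thread.

* Group sizes: a very small outcome group is easier to predict perfectly.
tabulate tax

* Per-group summaries of each score. Ranges (min to max) that do not
* overlap across groups hint at (almost) perfect prediction.
tabstat pc1 pc2 pc3 pc4, by(tax) statistics(n min max mean sd) columns(statistics)

* The same comparison as box plots, one set of boxes per outcome group.
graph box pc1 pc2 pc3 pc4, over(tax)
```

These commands only describe the data. They do not test for separation formally, so anything that looks odd still has to be judged by eye.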

Thanks again,
Sheela


*******************************
Sheela Athreya, Ph.D.
Texas A&M University
4352 TAMU
College Station, TX 77843
phone: 1-979-845-4785
fax: 1-979-845-4070



Hello,

I am using Stata 9 for the PC, and have run the following command:

mlogit tax pc1 pc2 pc3 pc4, vce(jackknife)

where tax has five outcomes.

The regression results seem to be fine, but when I then try to run "predict p1 p2 p3 p4 p5" to obtain posterior probabilities, I get the following:
"p1: 27 missing values generated" (note: there are more than 27 cases in the data set)

And the resulting posterior probabilities are completely off-- every value of p1 is either 0, 1 or missing (".")
And, for a handful of cases, the values of p1-p5 are *all* 0.

This also happens when I do not use the vce(jackknife) option, and when I use three instead of four independent variables.

I suspect something is wrong among my cases (or variables) such that maybe two of my groups are so highly correlated that I am getting these results? But I am not familiar enough with the principles of multinomial logit to know. Any help or advice would be much appreciated.
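The symptoms described above can be examined case by case with something like the following. This is a rough sketch rather than a diagnosis: it assumes -mlogit- has just been fit on tax and pc1-pc4, and that p1-p5 do not exist yet. The names psum and pmiss are made up for illustration.

```stata
* Assumption: -mlogit tax pc1 pc2 pc3 pc4- has just been run and
* p1-p5, psum and pmiss are not existing variable names.

* Missing predictors are one possible source of missing predictions.
count if missing(pc1, pc2, pc3, pc4)

* Predicted probabilities, one new variable per outcome.
predict p1 p2 p3 p4 p5

* Ranges of the predictions; minima and maxima stuck at 0 and 1 stand out here.
summarize p1 p2 p3 p4 p5

* Per-case checks: number of missing probabilities and their row sum.
* rowtotal() treats missing values as 0, so incomplete rows also sum to less than 1.
egen pmiss = rowmiss(p1 p2 p3 p4 p5)
egen psum = rowtotal(p1 p2 p3 p4 p5)

* List the cases whose probabilities are missing or do not sum to 1.
list tax pc1 pc2 pc3 pc4 p1 p2 p3 p4 p5 if pmiss > 0 | abs(psum - 1) > 1e-4
```

The listed cases show which observations and which predictor values are involved, which may help narrow down whether the problem lies in the data or in the fitted model.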

Many thanks,



