Statalist The Stata Listserver



st: Class membership probability and mlogit


From   Jonathan Sterne <Jonathan.Sterne@bristol.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   st: Class membership probability and mlogit
Date   Fri, 11 May 2007 15:39:14 +0100

Dear statalisters

We have been fitting latent class models, the output of which is a set of posterior probabilities that each subject falls into one of six latent classes. We now want to use multinomial logistic regression (mlogit) to examine predictors of class membership.

One option is to assign each subject to her/his modal class (the class for which there is the highest probability of membership). However, this loses information: some subjects will have a high probability of belonging to a particular class, while others will have relatively similar probabilities of membership of two or more classes.

As an alternative, we wish to fit multinomial logistic regression models using the class variable as the multinomial outcome and weighting the analysis using class membership probabilities.
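To make the weighting concrete, the criterion that such a probability-weighted fit maximizes on the stacked data can be sketched as below. The notation is an assumption introduced here for illustration: p_{ik} is the posterior probability that subject i belongs to class k, x_i is the covariate vector, and b is whichever class is taken as the base outcome.

```latex
\ell(\beta) = \sum_{i=1}^{N} \sum_{k=1}^{6} p_{ik} \, \log \pi_{ik}(\beta),
\qquad
\pi_{ik}(\beta) = \frac{\exp(x_i^\top \beta_k)}{\sum_{j=1}^{6} \exp(x_i^\top \beta_j)},
\qquad
\beta_b = 0 \text{ for the base class } b .
```

Because the p_{ik} sum to one within subject, each subject contributes a total weight of one. Modal-class assignment is the special case in which p_{ik} is replaced by 1 for the most probable class and 0 for the others.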

We have stacked the data so we have multiple rows for each subject in the following form

ID Exposure Class Prob
1 1 1 0.1
1 1 2 0.1
1 1 3 0.4
1 1 4 0.3
1 1 5 0.05
1 1 6 0.05

'Prob' sums to one within subject, and 'Class' repeats 1, 2, 3, 4, 5, 6 throughout the whole dataset.
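The stacking step might look like the sketch below. It assumes the posterior probabilities start out in wide form, one row per subject, in variables named prob1-prob6; those names and the two example subjects are hypothetical and are not taken from our data.

```stata
* Hypothetical wide layout: one row per subject with id, exposure, prob1-prob6
clear
input id exposure prob1 prob2 prob3 prob4 prob5 prob6
1 1 0.1 0.1 0.4 0.3  0.05 0.05
2 0 0.7 0.1 0.1 0.05 0.03 0.02
end

* Stack to one row per subject-class pair; the suffix 1-6 becomes the variable class
reshape long prob, i(id) j(class)

* Check that the probabilities sum to one within subject (up to rounding)
bysort id: egen double probsum = total(prob)
assert abs(probsum - 1) < 1e-6
drop probsum

list, sepby(id)
```

The assert stops the do-file if any subject's probabilities fail to sum to one, which is worth checking before using them as weights.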

We weight using pweights [pw = prob]

Consequently, our model of choice has been:

xi: mlogit class xvars [pw = prob], rrr
(identical to xi: mlogit class xvars [iw = prob], rrr robust)

and we have also experimented with

xi: mlogit class xvars [pw = prob], rrr robust cluster(id)

which gives lower SEs, and

xi: mlogit class exposure [iweight = prob], rrr

which gives *higher* SEs than the pweight model without 'robust'
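For anyone wishing to reproduce the comparison, a minimal sketch on simulated stacked data follows. Everything about the simulation (200 subjects, the seed, the way prob is generated) is an assumption made purely for illustration, and vce(cluster id) is used as the current spelling of robust cluster(id). The sketch only lays the three sets of standard errors side by side; it does not indicate which is theoretically appropriate.

```stata
* Simulated stacked data (hypothetical): 200 subjects, 6 rows each
clear
set seed 12345
set obs 200
gen id = _n
gen byte exposure = runiform() < 0.5
expand 6
bysort id: gen byte class = _n

* Arbitrary positive scores, normalised to sum to one within subject;
* exposure shifts probability mass toward class 3
gen double u = runiform() + (class == 3) * exposure
bysort id: egen double tot = total(u)
gen double prob = u / tot
drop u tot

* The three weighting and variance specifications discussed above
mlogit class exposure [pw = prob]
estimates store pw
mlogit class exposure [pw = prob], vce(cluster id)
estimates store pw_cluster
mlogit class exposure [iw = prob]
estimates store iw

* Coefficients with standard errors side by side
estimates table pw pw_cluster iw, se
```

The point estimates should agree across the three fits, since the weights are the same; only the variance estimator changes.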

We would be grateful for advice on the following questions:

1. Is it appropriate to weight according to class membership probability (we are pretty convinced that it is)?

2. Does anyone have a recommendation as to which of the above model formulations gives theoretically appropriate standard errors?

Many thanks

Jonathan Sterne



