Statalist The Stata Listserver


Re: st: Class membership probability and mlogit

From   "Jon Heron (ALSPAC)"
Subject   Re: st: Class membership probability and mlogit
Date   Mon, 14 May 2007 09:13:56 +0100

Hi Maarten,

Unfortunately, our data don't appear suitable for this model:
all but 3% of the cases have at least one probability which is
equal to zero.
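
A check of this kind could be sketched as follows. It matters because a Dirichlet model needs every proportion to lie strictly between 0 and 1. The data here are made up, and the wide layout with variables prob1-prob6 is an assumption for illustration, not the actual ALSPAC variable names.

```stata
* Made-up wide-format data: one row per subject, posterior probabilities
* in prob1-prob6 (variable names are assumptions, not from the thread)
clear
input id prob1 prob2 prob3 prob4 prob5 prob6
1 1   0   0   0   0    0
2 0   0   0   0   0    1
3 0.1 0.1 0.4 0.3 0.05 0.05
end

* Smallest class probability for each subject
egen double pmin = rowmin(prob1-prob6)

* Flag subjects with at least one probability of exactly zero
gen byte anyzero = (pmin == 0)
tabulate anyzero
```

If the probabilities are only "practically" zero rather than exactly zero, the comparison could be replaced by a small threshold of one's own choosing.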

In particular, for the commonest response 'patterns' YYYYYYY and
NNNNNNN, we find ourselves with one probability practically equal
to one and very little else.  Only when we have a great deal of
uncertainty, e.g. for NYNNYNY, will we get six non-zero class assignment
probabilities.


Dr Jon Heron
Statistics Team Leader
ALSPAC, Dept of Social Medicine
24 Tyndall Avenue
Bristol BS8 1TQ
Tel: 0117 3311616
Fax: 0117 3311704

--On 11 May 2007 20:24 +0100 Maarten Buis wrote:

I have another suggestion. You could use the probabilities as the
dependent variable by estimating a -dirifit- model. See:

Hope this helps,

--- Jonathan Sterne wrote:

Dear statalisters

We have been fitting latent class models, the output of which is a
set of posterior probabilities that each subject falls into one of six
classes. We now want to use multinomial logistic regression (mlogit)
to examine predictors of class membership.

One option is to assign each subject to her/his modal class (the
class for which there is the highest probability of membership).
However, this loses information: some subjects will have a high
probability that they belong to a particular class, while others will
have relatively similar probabilities of membership of two or more
classes.
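
As a rough illustration of what modal assignment discards, here is a minimal sketch. It assumes wide-format data with the posterior probabilities in prob1-prob6 (hypothetical names); the first subject uses the probabilities from the layout shown later in this message and the second is made up.

```stata
* Hypothetical wide-format data: one row per subject
clear
input id prob1 prob2 prob3 prob4 prob5 prob6
1 0.1  0.1  0.4 0.3  0.05 0.05
2 0.02 0.01 0.9 0.03 0.02 0.02
end

* Highest posterior probability for each subject
egen double pmax = rowmax(prob1-prob6)

* Modal class: the class attaining that maximum
* (ties go to the lowest-numbered class)
gen byte modal = .
forvalues k = 1/6 {
    replace modal = `k' if missing(modal) & prob`k' == pmax
}

list id modal pmax
```

Both subjects end up in class 3, although one is assigned with probability 0.9 and the other with only 0.4.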

As an alternative, we wish to fit multinomial logistic regression
using the class variable as the multinomial outcome and weighting the
analysis using class membership probabilities.

We have stacked the data so we have multiple rows for each subject in
the following form

        ID     Exposure     Class     Prob
        1      1            1         0.1
        1      1            2         0.1
        1      1            3         0.4
        1      1            4         0.3
        1      1            5         0.05
        1      1            6         0.05

'Prob' sums to one within subject and class repeats 1,2,3,4,5,6
through the whole dataset.
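
For anyone wanting to reproduce this layout, one way to get there from one-row-per-subject data might look like the sketch below. The wide variable names prob1-prob6 are assumptions, and the second subject is made up for illustration.

```stata
* Hypothetical wide-format data: one row per subject, with posterior
* probabilities in prob1-prob6 (names assumed for illustration)
clear
input id exposure prob1 prob2 prob3 prob4 prob5 prob6
1 1 0.1 0.1  0.4  0.3 0.05 0.05
2 0 0.9 0.05 0.05 0   0    0
end

* Stack to one row per subject-class pair; class takes values 1 to 6
reshape long prob, i(id) j(class)

* Check that the probabilities sum to one within subject
bysort id: egen double total_prob = total(prob)
assert abs(total_prob - 1) < 1e-6

list id exposure class prob, sepby(id)
```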

We weight using pweights [pw = prob]

Consequently, our model of choice has been:

xi: mlogit class xvars [pw = prob], rrr
(identical to xi: mlogit class xvars [iw = prob], rrr robust)

and we have also experimented with

xi: mlogit class xvars [pw = prob], rrr robust cluster(id)

which gives lower SE's, and

xi: mlogit class exposure [iweight = prob], rrr

which gives *higher* SE's than the pweight model without 'robust'
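
To put the three sets of standard errors side by side, something like the following might be used. It is a sketch only: the data are simulated (made-up probabilities and a single binary exposure), the stored-estimate names are arbitrary, and it does not settle which standard errors are theoretically appropriate.

```stata
* Simulated stand-in for the stacked data (all values are made up)
clear
set seed 20070514
set obs 200
gen long id = _n
gen byte exposure = runiform() < 0.5

* Made-up posterior probabilities, tilted towards classes 1-3 if exposed
forvalues k = 1/6 {
    gen double w`k' = runiform() * cond(exposure == 1 & `k' <= 3, 2, 1)
}
egen double wsum = rowtotal(w1-w6)
forvalues k = 1/6 {
    gen double prob`k' = w`k' / wsum
}
drop w1-w6 wsum

* One row per subject-class pair, as in the layout above
reshape long prob, i(id) j(class)

* The three formulations from the message
mlogit class exposure [pw = prob]
estimates store pw

* vce(cluster id) is the current spelling of robust cluster(id)
mlogit class exposure [pw = prob], vce(cluster id)
estimates store pw_cl

mlogit class exposure [iw = prob]
estimates store iw

* Coefficients with standard errors, one column per model
estimates table pw pw_cl iw, se
```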

We would be grateful for advice on the following questions:

1. Is it appropriate to weight according to class membership
(we are pretty convinced that it is)?

2. Does anyone have a recommendation as to which of the above model
formulations gives theoretically appropriate standard errors?

Many thanks

Jonathan Sterne

*   For searches and help try:

Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715


© Copyright 1996–2017 StataCorp LLC