Statalist The Stata Listserver



Re: st: Class membership probabiliy and mlogit


From   Gindo Tampubolon <Gindo.Tampubolon@manchester.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Class membership probabiliy and mlogit
Date   Tue, 15 May 2007 08:50:53 +0100

Dear Jon, Jonathan and others,

Bolck et al. (2004) describe one way to deal with this situation [in some areas
it is known as a variant of the latent structure model, MIMIC model, etc.].
What you plan to do is what they call a three-step approach, as opposed to a
one-step approach that sets up the full likelihood of the measurement part
[latent class] and the structural part [multinomial logit] together.

Here's the reference:
Bolck A, Croon M, Hagenaars J. 2004. Estimating Latent Structure Models with
Categorical Variables: One-Step Versus Three-Step Estimators. Political
Analysis 12(1): 3-27.

I tend to use full maximum likelihood in other software, so I don't have
any experience in constructing the matrix they suggest to correct the bias
in the three-step approach.

HTH,

Gindo



Date: Mon, 14 May 2007 09:13:56 +0100
From: "Jon Heron (ALSPAC)" <Jon.Heron@bristol.ac.uk>
Subject: Re: st: Class membership probabiliy and mlogit

Hi Maarten,


Unfortunately, our data don't appear suitable for this model: all but 3% of
the cases have at least one probability which is equal to zero.

In particular, for the commonest response 'patterns' YYYYYYY and
NNNNNNN, we find ourselves with one probability practically equal
to one and very little else.  Only when we have a great deal of
uncertainty, e.g. for NYNNYNY, will we get six non-zero class assignment
probs.


cheers


Jon
--------------------------------------------------
Dr Jon Heron
Statistics Team Leader
ALSPAC, Dept of Social Medicine
24 Tyndall Avenue
Bristol BS8 1TQ
Tel: 0117 3311616
Fax: 0117 3311704



--On 11 May 2007 20:24 +0100 Maarten buis <maartenbuis@yahoo.co.uk> wrote:

I have another suggestion. You could use the probabilities as the
dependent variable by estimating a -dirifit- model. See:
http://home.fsw.vu.nl/m.buis/software/dirifit.html

Hope this helps,
Maarten

--- Jonathan Sterne <Jonathan.Sterne@bristol.ac.uk> wrote:

Dear statalisters

We have been fitting latent class models, the output of which is a set of
posterior probabilities that each subject falls into one of six latent
classes. We now want to use multinomial logistic regression (mlogit) to
examine predictors of class membership.

One option is to assign each subject to her/his modal class (the class for
which there is the highest probability of membership). However, this loses
information: some subjects will have a high probability that they belong to
a particular class, while others will have relatively similar probabilities
of membership of two or more classes.
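
As a minimal sketch of the modal-class assignment described above, assuming
the posterior probabilities are held in wide form (one row per subject) in
hypothetical variables prob1-prob6; ties are resolved in favour of the
lowest-numbered class, which is an arbitrary choice:

```stata
* Sketch only: prob1-prob6, maxprob and modal are assumed names,
* not taken from the original post.
egen double maxprob = rowmax(prob1-prob6)

gen byte modal = .
forvalues k = 1/6 {
    * assign class k where it attains the maximum and no class is set yet
    replace modal = `k' if prob`k' == maxprob & missing(modal)
}

* maxprob shows how much certainty the modal assignment discards
summarize maxprob, detail
```

Subjects with a low maxprob are the ones for whom modal assignment throws
away the most information.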

As an alternative, we wish to fit multinomial logistic regression models
using the class variable as the multinomial outcome and weighting the
analysis using class membership probabilities.

We have stacked the data so we have multiple rows for each subject in the
following form

        ID     Exposure     Class     Prob
        1      1            1         0.1
        1      1            2         0.1
        1      1            3         0.4
        1      1            4         0.3
        1      1            5         0.05
        1      1            6         0.05

'Prob' sums to one within subject and class repeats 1,2,3,4,5,6 through the
whole dataset.
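
One way to arrive at this stacked layout, sketched under the assumption that
the data start in wide form with one row per subject and hypothetical
variables id, exposure and prob1-prob6 (the post does not say how the
stacking was actually done):

```stata
* Sketch only: id, exposure and prob1-prob6 are assumed variable names.
* reshape creates one row per subject-class pair, with class = 1,...,6
* taken from the numeric suffix of prob1-prob6.
reshape long prob, i(id) j(class)

* Check that prob sums to one within subject; the 1e-4 tolerance is an
* arbitrary allowance for rounding in the stored probabilities.
bysort id: egen double probsum = total(prob)
assert abs(probsum - 1) < 1e-4
drop probsum
```

The assert stops with an error if any subject's probabilities fail to sum
to one, which guards against a mis-specified reshape before weighting.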

We weight using pweights [pw = prob]

Consequently, our model of choice has been:

xi: mlogit class xvars [pw = prob], rrr
(identical to xi: mlogit class xvars [iw = prob], rrr robust)

and we have also experimented with

xi: mlogit class xvars [pw = prob], rrr robust cluster(id)

which gives lower SE's, and

xi: mlogit class exposure [iweight = prob], rrr

which gives *higher* SE's than the pweight model without 'robust'
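
To compare the standard errors from these formulations side by side, a
sketch along the following lines may help. It assumes the stacked data above
with variables class, exposure, prob and id; the stored-estimate names are
illustrative only:

```stata
* Sketch only: fits the three weighted models and tabulates their SEs.
* pweights imply robust (sandwich) standard errors.
mlogit class exposure [pw = prob], rrr
estimates store pw_robust

* As above, but allowing for the six rows contributed by each subject.
mlogit class exposure [pw = prob], rrr cluster(id)
estimates store pw_cluster

* iweights with the default (model-based) standard errors.
mlogit class exposure [iw = prob], rrr
estimates store iw_model

* Coefficients (log relative-risk ratios) with standard errors
estimates table pw_robust pw_cluster iw_model, se
```

The point estimates should agree across the three columns; only the
standard errors differ, which is what question 2 below is about.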

We would be grateful for advice on the following questions:

1. Is it appropriate to weight according to class membership probability
(we are pretty convinced that it is)?

2. Does anyone have a recommendation as to which of the above model
formulations gives theoretically appropriate standard errors?

Many thanks

Jonathan Sterne
