Re: st: clustering with a new dataset
Hi Walt,
I am a Stata beginner, so I have little to offer on the procedures
available in Stata; perhaps other listers can. Conceptually, though,
it sounds as though she wants to cross-validate
her model. A well-fitting cluster (or factor) model is tentative and
requires post hoc validation. With a large enough sample, the
researcher can set aside a random holdout subsample and cross-validate
the model on it (i.e., within-sample replication). Even if a model
fits the data well, however, that does not make it the correct model,
or even the best model, for explaining the phenomenon of interest.
There may be equivalent models that fit the sample data or other data
sources equally well. If the researcher uncovers equivalent models,
no statistical technique can discriminate among them; only
substantive knowledge about the phenomenon can guide the choice of
the best one. The researcher may
judge a model "good" on both theoretical and statistical grounds and
thus provisionally accept it. Cross-validation on different
independent samples from the same population (which seems to be your
case) can enhance the utility of the model. To offer the client some
insight, you could compare the models by examining the overall fit
indices (e.g., chi-square, RMSEA) and the significance of the path
coefficients. I hope this helps.
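
If it helps to make "applying last year's clusters to this year's
data" concrete, here is a rough sketch in Python rather than Stata,
since I cannot vouch for the Stata commands. It rests on an assumption
of mine, not on anything Ward's method provides: each new respondent
is assigned to the cluster whose last-year centroid (the mean of each
variable) is nearest in Euclidean distance. The data and the names
(centroids, assign, last_year, this_year) are made up for
illustration, and both years' variables would need to be on the same
scale.

```python
import math

def centroids(rows, labels):
    """Mean of each variable within each of last year's clusters."""
    sums, counts = {}, {}
    for row, lab in zip(rows, labels):
        acc = sums.setdefault(lab, [0.0] * len(row))
        for j, x in enumerate(row):
            acc[j] += x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def assign(row, cents):
    """Label of the centroid closest to row (Euclidean distance)."""
    return min(cents, key=lambda lab: math.dist(row, cents[lab]))

# Hypothetical toy data: two survey variables, two clusters from last year.
last_year = [[1.0, 1.0], [1.0, 2.0], [8.0, 8.0], [9.0, 8.0]]
last_labels = [1, 1, 2, 2]
this_year = [[1.5, 1.0], [8.5, 9.0]]

cents = centroids(last_year, last_labels)
new_labels = [assign(r, cents) for r in this_year]
print(new_labels)
```

Cross-tabulating these assigned labels against the clusters from a
fresh run on this year's data would give one informal check of
whether last year's structure replicates.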
Best,
Frank
On May 3, 2009, at 9:08 PM, Data Analytics Corp. wrote:
Hi,
I ran a cluster analysis last year for a client using "cluster ward
varlist" where the variables in varlist came from a survey. This
worked fine and the client was happy. This year, she returned with a
new dataset (same variables, just new values from a new survey) and
wants last year's clusters applied to this year's data. I can't see
how to do this - in fact it doesn't seem to make sense. Any
suggestions, or should I tell her that I can just rerun the old
commands and MAYBE the same clusters will appear?
Thanks,
Walt
--
________________________
Walter R. Paczkowski, Ph.D.
Data Analytics Corp.
44 Hamilton Lane
Plainsboro, NJ 08536
________________________
(V) 609-936-8999
(F) 609-936-3733
[email protected]
www.dataanalyticscorp.com