Statalist



st: cluster analysis with missing data


From   "Data Analytics Corp." <dataanalytics@earthlink.net>
To   Stata Listserve <statalist@hsphsun2.harvard.edu>
Subject   st: cluster analysis with missing data
Date   Wed, 29 Jul 2009 11:08:34 -0400

Hi Stata,

A client has a dataset from a survey in which each consumer was shown a randomly selected set of 25 needs statements out of a total of 152. The client wants to cluster the 152 needs statements (i.e., 152 variables). Since the 25 statements were selected at random, the missing values should be Missing Completely at Random (MCAR). But because each consumer responded to only 25 statements, every record has 127 missing values. I assume that Stata's clustering routines do list-wise deletion, which would leave no complete records available for clustering. Does anyone have any ideas or suggestions for how to handle this? Can a similarity matrix still be created with so many missing data points, and if so, how?

Thanks,

Walt



--
________________________

Walter R. Paczkowski, Ph.D.
Data Analytics Corp.
44 Hamilton Lane
Plainsboro, NJ 08536
________________________
(V) 609-936-8999
(F) 609-936-3733
dataanalytics@earthlink.net
www.dataanalyticscorp.com
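
One possible way to build a similarity matrix here is pairwise deletion: for each pair of statements, compute the correlation using only the respondents who rated both. With 25 of 152 statements shown at random, any given pair is seen together by roughly 2.6% of respondents (25/152 x 24/151), so the sample needs to be large for these correlations to be stable. The sketch below is in Python rather than Stata and uses simulated data. The function names, the 1-5 rating scale, the small problem size (12 statements, 5 shown), and the `min_pairs` threshold are all assumptions made for illustration, not part of the original survey. In Stata, `pwcorr` computes correlations pairwise, and `clustermat` clusters from a user-supplied dissimilarity matrix; whether that combination works well for this dataset is untested here.

```python
import math
import random


def pearson(pairs):
    """Pearson correlation of a list of (x, y) pairs; None if either variance is zero."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def pairwise_corr(rows, n_vars, min_pairs=3):
    """Correlation matrix using, for each pair of variables, only the rows
    where both are observed (pairwise deletion). Entries stay None when
    fewer than min_pairs rows have both variables observed."""
    corr = [[None] * n_vars for _ in range(n_vars)]
    for i in range(n_vars):
        for j in range(i, n_vars):
            pairs = [(r[i], r[j]) for r in rows
                     if r[i] is not None and r[j] is not None]
            if len(pairs) >= min_pairs:
                corr[i][j] = corr[j][i] = pearson(pairs)
    return corr


def simulate(n_resp, n_vars, n_shown, seed=0):
    """Hypothetical survey: each respondent rates a random subset of
    n_shown statements on a 1-5 scale; unseen statements are None."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_resp):
        shown = set(rng.sample(range(n_vars), n_shown))
        rows.append([rng.randint(1, 5) if v in shown else None
                     for v in range(n_vars)])
    return rows


# Small illustrative run (sizes are assumptions, not the real survey).
rows = simulate(n_resp=500, n_vars=12, n_shown=5)
corr = pairwise_corr(rows, n_vars=12)

# Turn similarities into dissimilarities (1 - r) for a clustering routine.
dissim = [[None if c is None else 1.0 - c for c in row] for row in corr]
```

A matrix assembled this way is not guaranteed to be positive semidefinite, because each entry is based on a different subset of respondents. That matters little for hierarchical clustering on the dissimilarities, but it can cause trouble for methods that need a valid correlation matrix, such as factor analysis.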
