Statalist



st: RE: Clustering with missing values


From   Dan Weitzenfeld <dan.weitzenfeld@emsense.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Clustering with missing values
Date   Mon, 15 Jun 2009 12:40:11 -0500

To close the loop on my own question: I'm using -matrix dissimilarity ..., gower-, which uses a dissimilarity measure that allows for missing values, and then running -clustermat- on the resulting matrix.
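For anyone curious why a Gower-type measure tolerates missing values, here is a rough sketch of the idea in Python. It is not Stata's implementation. It assumes continuous variables only, marks missing values with None, and skips variables with zero observed range (that last choice is mine). The function name and the toy data are made up for illustration. Each pairwise dissimilarity is averaged over only the variables that both observations have.

```python
import math


def gower_dissimilarity(rows):
    """Pairwise Gower-style dissimilarity for continuous variables.

    None marks a missing value. A variable contributes to a pair only
    when both observations have it.
    """
    n_vars = len(rows[0])

    # Range of each variable over its observed values, used to scale
    # absolute differences into [0, 1].
    ranges = []
    for k in range(n_vars):
        observed = [r[k] for r in rows if r[k] is not None]
        ranges.append(max(observed) - min(observed) if observed else 0.0)

    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total, used = 0.0, 0
            for k in range(n_vars):
                a, b = rows[i][k], rows[j][k]
                # Assumption: skip the variable if either value is missing
                # or if the variable has no spread at all.
                if a is None or b is None or ranges[k] == 0:
                    continue
                total += abs(a - b) / ranges[k]
                used += 1
            # If a pair shares no usable variable, the dissimilarity is undefined.
            d = total / used if used else math.nan
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical toy data: three observations, three variables, two gaps.
data = [
    [1.0, 10.0, None],
    [2.0, None, 5.0],
    [3.0, 30.0, 9.0],
]
D = gower_dissimilarity(data)
```

The resulting symmetric matrix plays the role of the dissimilarity matrix that a matrix-based hierarchical clustering routine would take as input. Pairs that share no observed variable come out as NaN and would need handling before clustering.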



-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Dan Weitzenfeld
Sent: Friday, June 12, 2009 9:53 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: Clustering with missing values

Hi All,
I am doing a cluster analysis with a dataset that is sparse in a subset of variables.  I'm against the standard techniques - modeling the missing values, or replacing them with the mean - for theoretical reasons.  
A Google search turned up a paper about using soft constraints - essentially using the sparse variables only when they exist - and I'm wondering if there is a package/routine in Stata that implements this (or a similar) technique.
The paper is available at:
http://www.litech.org/~wkiri/Papers/wagstaff-missing-ifcs04.pdf
Thanks,
Dan





*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/





