
st: RE: treatment of missing values in a matrix dissimilarity score

From	"Verkuilen, Jay" <[email protected]>
To	<[email protected]>
Subject	st: RE: treatment of missing values in a matrix dissimilarity score
Date	Mon, 15 Sep 2008 16:23:23 -0400

Zachary S Elkins WROTE:

(Small world, how's it going?) 

>However, some values in x1-x4 are missing.  Based on the results, it
appears that Stata treats missing values as if they were 1 and I don't
see how to modify that.  I'd like to calculate similarities
across only non-missing elements (the number of which will be different
for each pair, of course).<

This is a real problem, and I suspect most users of dissimilarity-type
measures just sweep it under the rug.

Listwise deletion has the same drawbacks here as in any other area, but
absent a better option it makes sense for Stata to drop the missing
values.
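
For reference, the pairwise approach asked about above (comparing each
pair only on the elements both observations have) can be sketched in a
few lines. This is a minimal illustration in Python, not Stata's own
routine: the function names are made up for the example, missing values
are coded as None, and simple matching (proportion of disagreements) is
assumed as the dissimilarity measure.

```python
def matching_dissimilarity(a, b):
    """Proportion of disagreements between two binary vectors,
    computed only over positions where both values are observed."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return None  # no shared non-missing elements, so undefined
    return sum(x != y for x, y in pairs) / len(pairs)


def dissimilarity_matrix(rows):
    """Full matrix of pairwise-complete matching dissimilarities."""
    return [[matching_dissimilarity(a, b) for b in rows] for a in rows]


# Toy data: three observations on x1-x4, None marks a missing value.
rows = [
    [1, 0, 1, None],
    [1, 1, None, 0],
    [0, 0, 1, 1],
]
D = dissimilarity_matrix(rows)
for line in D:
    print(line)
```

Note that each cell then rests on a different number of elements, so
the entries are not equally reliable, and the resulting matrix need not
behave like a proper distance matrix.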

One thing you might consider is just trying multiple imputation on the
binary dataset using one of the methods that give you binary responses
back, e.g., simulating from the relevant Bernoulli distributions. Then
make several dissimilarity matrices and see how different they are. You
could always use "by" processing and then average the results of your
analyses back. Much depends on what you're intending to do with the
dissimilarity matrices. I'm guessing some kind of clustering or MDS type
application. 
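
A rough sketch of that idea, in Python rather than Stata and under
strong simplifying assumptions: each missing value is drawn from a
Bernoulli distribution whose probability is just the observed
proportion of 1s in that variable (a proper imputation model would
condition on the other variables), every variable has at least one
observed value, and simple matching is used as the dissimilarity. All
function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def impute_once(rows, rng):
    """Fill each missing value (None) with a Bernoulli draw whose
    probability is the observed proportion of 1s in that column."""
    ncols = len(rows[0])
    p = [mean(r[j] for r in rows if r[j] is not None) for j in range(ncols)]
    return [
        [r[j] if r[j] is not None else int(rng.random() < p[j])
         for j in range(ncols)]
        for r in rows
    ]


def matching_matrix(rows):
    """Simple matching dissimilarity on a completed binary dataset."""
    return [
        [sum(x != y for x, y in zip(a, b)) / len(a) for b in rows]
        for a in rows
    ]


def imputed_matrices(rows, m, seed=0):
    """One dissimilarity matrix per imputed dataset."""
    rng = random.Random(seed)
    return [matching_matrix(impute_once(rows, rng)) for _ in range(m)]


rows = [
    [1, 0, 1, 1],
    [1, 1, None, 0],
    [0, 0, 1, None],
    [0, 1, 0, 0],
]
mats = imputed_matrices(rows, m=20)
n = len(rows)
# Cell-wise average and spread across the imputed matrices.
avg = [[mean(M[i][j] for M in mats) for j in range(n)] for i in range(n)]
sd = [[pstdev(M[i][j] for M in mats) for j in range(n)] for i in range(n)]
```

Cells with a large spread in sd are the ones where the missing values
matter most; whether averaging the matrices or averaging the downstream
results is preferable depends on the analysis.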

JV
