
st: computing statistical (mahalanobis) distance


From   Radu Ban <raduban@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: computing statistical (mahalanobis) distance
Date   Wed, 9 Nov 2005 12:21:52 -0500

Dear listers,

I have the following data structure:

pair_id  x1 x2 ... xn  y1 y2 ... yn
1
2
3
...

where pair_id identifies the pair, xi are the coordinates of the
first point in the pair, and yi are the coordinates of the second
point in the pair.

Now I would like to compute, for each pair, the statistical distance
between its two points, defined as sqrt[(x-y)'S^(-1)(x-y)], where S is
the covariance matrix of the vectors x and y.

One way I think this can be done is to reshape x and y long, compute
the statistical distance for each pair separately, and then append
the results together. Something along the lines of:

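reshape long x y, i(pair_id) j(coordind)
save ...myfile, replace

forval i = 1/... {
    * reload the full long file, since keep drops the other pairs
    use ...myfile, clear
    keep if pair_id == `i'
    * covariance matrix of x and y for this pair
    matcorr x y, m(M`i') c
    mkmat x, mat(X)
    mkmat y, mat(Y)
    matrix D = (X-Y)' *inv(M`i')*(X-Y)
    * svmat stores the matrix D in a new variable named d1
    svmat double D, name(d)
    gen mahal = sqrt(d1)
    save ...myfile`i', replace
}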

*and now append all the bits together.

It seems to me that this would take a long time to run, given that I
have approximately 50,000 pairs with 5 coordinates per point. I was
wondering if there is an easier or quicker way to do this.
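
One possible shortcut, given only as a rough sketch and not tested on
these data, is to skip the reshape and the loop and work on the wide
data in Mata (Stata 9 or later). It assumes exactly 5 coordinates named
x1-x5 and y1-y5, no missing values, and that S is meant to be the
5 x 5 covariance matrix of the coordinates, estimated once from all
points with the x and y points stacked. That reading of S is an
assumption; a different S would need a different line for S.

```stata
* Sketch only. Assumes wide data in memory: pair_id x1-x5 y1-y5,
* with no missing values.
* Assumption: S is the 5 x 5 covariance matrix of the coordinates,
* estimated once from all points (x and y points stacked).
mata:
X = st_data(., tokens("x1 x2 x3 x4 x5"))   // first points,  N x 5
Y = st_data(., tokens("y1 y2 y3 y4 y5"))   // second points, N x 5
S = variance(X \ Y)                        // covariance of the stacked points
D = X - Y                                  // coordinate differences, N x 5
d2 = rowsum((D * invsym(S)) :* D)          // (x-y)' S^(-1) (x-y), one value per row
st_store(., st_addvar("double", "mahal"), sqrt(d2))
end
```

Row i of (D * invsym(S)) :* D sums to the quadratic form for pair i,
so the new variable mahal (a name chosen here for illustration) holds
one distance per pair without any reshape, loop, or append.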

Any ideas/leads are much appreciated.

Thanks in advance,
Radu
