# st: computing statistical (mahalanobis) distance

 From: Radu Ban; To: statalist@hsphsun2.harvard.edu; Subject: st: computing statistical (mahalanobis) distance; Date: Wed, 9 Nov 2005 12:21:52 -0500

```
Dear listers,

I have the following data structure:

pair_id  x1 x2 ... xn  y1 y2 ... yn
1
2
3
..

where pair_id identifies the pair, xi are the coordinates of the
first point in the pair, and yi are the coordinates of the second
point in the pair.

Now I would like to compute, for each pair, the statistical distance
between its two points, defined as sqrt[(x-y)' S^(-1) (x-y)], where S
is the covariance matrix of the vectors x and y.

One way I think this can be done is to reshape x and y long, compute
the statistical distance for each pair separately, and then append the
results together. Something along the lines of:

reshape long x y, i(pair_id) j(coordind)
save ...myfile, replace

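forval i = 1/... {
    * reload the full long file each time, since keep drops the other pairs
    use ...myfile, clear
    keep if pair_id == `i'
    * covariance matrix of x and y for this pair
    matcorr x y, m(M`i') c
    mkmat x, mat(X)
    mkmat y, mat(Y)
    * squared statistical distance for this pair
    matrix D = (X-Y)' * inv(M`i') * (X-Y)
    * svmat with name(d) stores the result in the variable d1
    svmat double D, name(d)
    gen mahal = sqrt(d1)
    save ...myfile`i', replace
}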

*and now append all the bits together.

It seems to me that this would take a long time to run, given that I
have approx. 50,000 pairs with 5 coordinates per point. I was
wondering if there is an easier/quicker way to do this.
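
* A possible quicker route, offered only as a sketch: keep the data wide
* and build the quadratic form with gen/replace, so there is no loop over
* pairs. Assumptions not stated above: n = 5, the variables are named
* x1-x5 and y1-y5, and S is estimated as the covariance matrix of x1-x5
* across all pairs (substitute whatever 5x5 matrix S is actually intended).
quietly correlate x1 x2 x3 x4 x5, covariance
matrix S = r(C)
matrix Sinv = invsym(S)
* accumulate (x-y)' Sinv (x-y) term by term for every pair at once
gen double mahal2 = 0
forval j = 1/5 {
    forval k = 1/5 {
        quietly replace mahal2 = mahal2 + (x`j'-y`j') * Sinv[`j',`k'] * (x`k'-y`k')
    }
}
gen double mahal = sqrt(mahal2)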