

From: Austin Nichols <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Calculating Euclidean Distance
Date: Thu, 10 Jun 2010 09:59:13 -0400

Anthony Laverty <anthonylav@googlemail.com> :

If you have N hospitals at T points in time, then you will have NTxN squared distances in your variables, and if they are doubles you may well run out of memory long before that, but if all you want is the nearest hospital, then you want one variable per hospital giving the identity of the nearest (over all months, you seem to suggest). You might also want to compute distance on a log scale, or some other metric. With more detail on your problem, you may get a better answer. Nevertheless, this is like what you asked for, I think:

clear
input str1 hospital time patients
A 1 456
A 2 759
A 3 236
B 1 214
B 2 854
B 3 325
C 1 250
C 2 321
C 3 852
end
egen g=group(hospital)
su g, mean
loc N=r(max)
forv i=1/`N' {
 g double d`i'=.
 }
levelsof time, loc(ts)
fillin time g
sort time g
g long obs=_n
qui foreach t of loc ts {
 su obs if time==`t', mean
 loc n0=r(min)
 loc n1=r(max)
 forv i=`n0'/`n1' {
  loc n=`i'-`n0'+1
  replace d`n'=(patients-patients[`i'])^2 if inrange(_n,`n0',`n1')
  }
 }
l, sepby(time) noo

On Thu, Jun 10, 2010 at 5:08 AM, Anthony Laverty
<anthonylav@googlemail.com> wrote:
> Dear Statalist
>
> I have data on patient numbers at various hospitals and am trying to
> calculate a new variable which is the Euclidean distance between one
> specific hospital (say A) and all of the others, so that i can select
> which hospitals had the most similar number of patients across all
> months. The data is more or less arranged like this (although it has
> a few more columns not of direct interest to this question):
>
> Hospital   Time   Patients
> A          1      456
> A          2      759
> A          3      236
> B          1      214
> B          2      854
> B          3      325
> C          1      250
> C          2      321
> C          3      852
>
> So, i want to cycle through each time period and calculate the
> difference squared between hospital A and all of the other hospitals
> individually as one new variable.
>
> Any suggestions greatly appreciated
>
> Anthony Laverty

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
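The reply above produces squared differences per time point; the original question also asks for a single distance from hospital A across all months. The sketch below is one possible way to get there, and it is not the loop from the reply: it is a shorter route that attaches A's patient count to every row of the same month, then sums the squared differences by hospital and takes the square root. The variable names `ref`, `sqdiff`, `sqdist`, and `dist` are made up for illustration, and the code assumes hospital A is observed in every time period.

```stata
clear
input str1 hospital time patients
A 1 456
A 2 759
A 3 236
B 1 214
B 2 854
B 3 325
C 1 250
C 2 321
C 3 852
end

* Attach the reference hospital's (A's) patient count to every row
* of the same month. Assumption: A appears in every time period.
bysort time: egen double ref = max(cond(hospital == "A", patients, .))

* Squared difference from A at each time point
generate double sqdiff = (patients - ref)^2

* Sum over months, then take the square root: Euclidean distance from A
collapse (sum) sqdist = sqdiff, by(hospital)
generate double dist = sqrt(sqdist)

* Hospitals most similar to A come first
sort dist
list, noobs
```

With the example data, the squared distances are 242^2 + 95^2 + 89^2 = 75510 for B and 206^2 + 438^2 + 616^2 = 613736 for C, so B (about 274.79) is closer to A than C (about 783.41). Note that `collapse (sum)` treats missing values as zero, so hospitals with missing months would need separate handling.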

**Follow-Ups**: **Re: st: Calculating Euclidean Distance**, *From:* Anthony Laverty <anthonylav@googlemail.com>

**References**: **st: Calculating Euclidean Distance**, *From:* Anthony Laverty <anthonylav@googlemail.com>
