

From: Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Matching procedure based on shortest distance given latitudes and longitudes
Date: Thu, 9 Feb 2012 20:37:34 +0100

Thanks to Robert for this smart and elegant way of dealing with this problem.

However, if there are fewer observations in 2010 than in 2006, matchid=1 is generated for all 2006 id=1 observations - even though there is no shortest distance associated with this id.

What would be an equally elegant way of solving this problem?

Ruediger.

2012/2/9 Robert Picard <picard@netbox.com>:
> As I mentioned to you a few days ago, you do not need a special
> program to find the nearest neighbors. You can simply use -cross- to
> form all pairwise combinations of 2006 and 2010 observations, compute
> all the distances, and then sort. I've added some code that does, I
> think, the matching you describe.
>
> Robert
>
> *----------- begin example -------------
> version 12
>
> set seed 1234
>
> * save 2010 observations separately
> clear
> set obs 10
> gen id2 = _n
> gen lat2 = 40 + runiform() * 5
> gen lon2 = 19 + runiform() * 5
> tempfile y2010
> save "`y2010'"
>
> * create 7 obs for 2006
> clear
> local nobs2006 7
> set obs `nobs2006'
> gen id1 = _n
> gen lat1 = 40 + runiform() * 5
> gen lon1 = 19 + runiform() * 5
>
> * form all pairwise combinations and compute distance
> cross using "`y2010'"
> * user-written program, to install: ssc install geodist
> geodist lat1 lon1 lat2 lon2, gen(d)
>
>
> gen d0 = d
> gen matchid = .
> gen matchd = .
>
> forvalues i = 1/`nobs2006' {
>     qui sum d
>     scalar mind = r(min)
>     qui sum id1 if d == mind
>     local bestid1 = r(min)
>     qui sum id2 if d == mind
>     local bestid2 = r(min)
>     qui replace matchid = `bestid2' if id1 == `bestid1'
>     qui replace matchd = mind if id1 == `bestid1'
>     qui replace d = . if id1 == `bestid1' | id2 == `bestid2'
>     dis "id1=" `bestid1' " matched " "id2=" `bestid2' " at d = " mind
> }
>
> sort id1 d0 id2
>
> *------------ end example --------------
>
>
>
> 2012/2/9 Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>:
>> Hello guys,
>>
>> I want to match each observation in a given year with
>> one observation in another year based on the shortest geographical
>> distance between them given the latitudes and longitudes of each
>> observation.
>>
>> I.e. the simplified structure of the dataset looks as follows:
>>
>> id   year   longitude   latitude
>> 1    2006   19.923      40.794
>> 2    2006   19.949      40.711
>> 1    2010   19.940      40.721
>> 2    2010   22.001      50.122
>>
>> Hence, I would like to match each observation in 2006 with the one
>> observation in 2010 that is closest AND that had not been matched to
>> any observation in 2006 before.
>>
>> The previously discussed -nearstat- command (thanks to Wilner!) cannot
>> be applied directly to this problem as it could match the same
>> observation in 2010 with multiple observations in 2006 (i.e. in this
>> example, the year 2010 observation with id 1 is closest to both
>> observations in 2006 - and hence would be matched).
>>
>> Does anybody have an idea for a nice solution or is there even a
>> command out there that would match based on distance given the
>> latitudes and longitudes?
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
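One possible way to handle the case raised in this message is sketched below. It is not taken from the thread; it is Robert's quoted loop with a single added check. Once every 2010 observation has been used, all values of d are missing, so -summarize- returns r(N) equal to zero and the loop can stop before it overwrites an earlier match. The choice of 5 observations for 2010 is an assumption made only to reproduce the situation, and -geodist- must be installed from SSC as in the original example.

```stata
* Sketch only: the quoted example with fewer 2010 than 2006 observations
* and an early exit added to the matching loop.
version 12
set seed 1234

* 2010 observations (5 is an illustrative assumption, fewer than 2006)
clear
set obs 5
gen id2 = _n
gen lat2 = 40 + runiform() * 5
gen lon2 = 19 + runiform() * 5
tempfile y2010
save "`y2010'"

* 2006 observations
clear
local nobs2006 7
set obs `nobs2006'
gen id1 = _n
gen lat1 = 40 + runiform() * 5
gen lon1 = 19 + runiform() * 5

* all pairwise combinations and their distances
cross using "`y2010'"
* user-written program, to install: ssc install geodist
geodist lat1 lon1 lat2 lon2, gen(d)

gen d0 = d
gen matchid = .
gen matchd = .

forvalues i = 1/`nobs2006' {
    qui sum d
    * r(N) counts non-missing distances; zero means no unmatched pair is left
    if r(N) == 0 {
        continue, break
    }
    scalar mind = r(min)
    qui sum id1 if d == mind
    local bestid1 = r(min)
    qui sum id2 if d == mind
    local bestid2 = r(min)
    qui replace matchid = `bestid2' if id1 == `bestid1'
    qui replace matchd = mind if id1 == `bestid1'
    * drop the matched 2006 and 2010 ids from further consideration
    qui replace d = . if id1 == `bestid1' | id2 == `bestid2'
}

sort id1 d0 id2
```

With this check, the 2006 observations left over after the 2010 observations run out keep matchid and matchd missing, which makes the unmatched cases easy to list with `if missing(matchid)`.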

**Follow-Ups**:
- **Re: st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>

**References**:
- **st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>
- **Re: st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Robert Picard <picard@netbox.com>
